A. BACKGROUND


One of the most important resources of an organization,
and one that is too often overlooked, is data. People,
dollars, materials, and time are usually well controlled and
budgeted, yet the data about an organization and its
operations is often managed haphazardly, if at all.

Database technology has made possible the storage and
processing of an organization's data as an integrated whole
and allows the sharing of that processed data, or
information, throughout the organization. A database management
system (DBMS) acts as a librarian for the database, storing
and retrieving data according to a particular format
[Ref. 1]. However, a DBMS does not necessarily provide for
the security, integrity, accountability, or maintainability
of data. These objectives are best achieved when a data
dictionary is used in conjunction with the DBMS.

Simply stated, a data dictionary is a central repository 
of descriptive data about the definition, characteristics, 
location, and usage of the data found in an organization. A
data dictionary will control the collection, maintenance,
and retrieval of this data. For example, if the aircraft
carrier U.S.S. Constellation had a data dictionary, it would
be possible to ask questions such as

What type of data is contained in a "Controlled Equipage"
record?

How many programs use the "Personnel" file?

Which departments receive the "Ammunition Transaction"
report?

What is the relationship between "Inventory Item" and
"Reorder Point"?

In which records is the field "Social Security Number"
found?

Who is authorized to update the "Readiness Status" field?

What is the range of values for "Readiness Status" data?

In which database is the "Preventive Maintenance" file
found?


Those who will benefit from the answers to these questions
include not only the ship's data administrator, but also
programmers, systems development personnel, data processing
staff, auditors, and, most important, end users at every
level of the organization.

Even though data dictionary software has been available
commercially since 1970 and the advantages and benefits
associated with data dictionaries are widely recognized,
most organizations have been slow to implement them, and the
Department of Defense is no exception. A recent study by
the Committee on Review of Navy Long-Range Automatic Data
Processing Planning [Ref. 2] points out that


    Virtually every action by a commander, manager, or
    administrator in the Navy, as in any large organization,
    involves the acquisition and understanding of
    information: information about the organization, about its
    status, about its resources, about its environment. His
    actions usually result in the creation and promulgation
    of policies and directives: that is, information to
    subordinates, peers, or superiors.


If it is true that "the benefit derived from a dictionary is
proportional to the size of the dictionary itself" [Ref. 3],
the military stands to gain a great deal from the
implementation of data dictionaries.

At present, there is no consensus in the computing
literature about exactly what a data dictionary should do or
what kind of data dictionary is best for a particular
organization. There are many different data dictionary
packages on the market from which to choose; most of these
have similar features. Therefore, the potential purchaser of
a data dictionary is in need of guidance when making this
choice. The United States Government has recognized this
problem and has identified standards for data dictionaries
in Federal Information Processing Standards promulgated by
the National Bureau of Standards. An understanding of these
standards and of the functions and objectives of a data
dictionary will provide the reader with a basis on which to
evaluate data dictionary packages and to use them
effectively.


B. PURPOSE OF THE THESIS 


We believe that it is important for managers in the
military to understand what a data dictionary is and what it
can do to help an organization manage its data. Thus, the
purpose of this thesis is to provide the reader with an
understanding of the structure and functions of a data
dictionary, guidelines for the evaluation and selection of a
data dictionary, and an analysis of several commercial data
dictionary products. We will show the reader how the
management of an organization's data resource can be
accomplished by means of a data dictionary and will recommend
ways for the role of the data dictionary to be expanded.
II. THE ANATOMY OF A DATA DICTIONARY


A. INTRODUCTION 


Because data dictionary technology is a new and
continually evolving field, it suffers from a lack of
consistency in its terminology. The many texts and articles on
the subject and the various commercial data dictionary
products use a wide variety of differing terms. The data
dictionary itself is also known as a data
dictionary/directory, a data dictionary system, or an
information resource management dictionary. In order to
provide a base of reference for the remainder of this
thesis, we will present our own set of definitions distilled
from our references.

Data dictionaries run the gamut from manual, on-paper
systems to highly sophisticated software and can be used
both in database and non-database environments. We will
discuss automated data dictionaries only as they relate to a
database, where they have the most to offer the potential
user.

In order to assess the benefits of a data dictionary, it
is necessary to understand how a data dictionary is
organized and what its capabilities are. A data dictionary
does not contain the actual data that constitutes an
organization's database; instead, it is itself a database
called a metadatabase that contains metadata, or data about
the database data. Two types of metadata are found in a data
dictionary. Dictionary metadata tells what data exists, the
origins of the data, the attributes the data may have, how
and by whom the data may be used, what the structure of the
data is, and what the relationships between the data are.
Directory metadata tells where the data is located, how it
can be accessed, and what its physical representation within
the computer is. Together, these two types of metadata
provide the means for accessing and controlling the data in
the database. Figure 2.1 illustrates this division of
metadata.


Figure 2.1 Types of Data Dictionary Metadata
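
The split between dictionary metadata and directory metadata
can be sketched as two small record types. This is only an
illustration: the class names, field names, and sample values
below are hypothetical and are not taken from any commercial
data dictionary.

```python
from dataclasses import dataclass


@dataclass
class DictionaryMetadata:
    """What the data is: its meaning, origin, usage, and relationships."""
    name: str
    origin: str
    authorized_users: list
    related_to: list


@dataclass
class DirectoryMetadata:
    """Where the data is and how it is physically represented."""
    location: str
    access_method: str
    physical_format: str


@dataclass
class MetadataEntry:
    """One metadatabase entry combining both kinds of metadata."""
    dictionary: DictionaryMetadata
    directory: DirectoryMetadata


# Hypothetical entry for the "Readiness Status" data mentioned in Chapter I.
entry = MetadataEntry(
    DictionaryMetadata(
        name="Readiness Status",
        origin="operations department",
        authorized_users=["data administrator"],
        related_to=["Preventive Maintenance"],
    ),
    DirectoryMetadata(
        location="READINESS file",
        access_method="indexed",
        physical_format="2 characters",
    ),
)
```

Note that the entry holds no actual readiness value; it only
describes the data, which is the point of a metadatabase.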

Data dictionaries fall into two categories--free-standing
and DBMS-dependent. Figure 2.2 shows a partial listing of
some commercial data dictionary packages according to type.
A free-standing data dictionary (also called independent or
stand-alone) is not tied to any particular database
management system (DBMS). It manages data by utilizing
software routines built into the data dictionary package and
thus is not dependent on DBMS software. This independence
provides flexibility: a free-standing data dictionary can
have the capability to support more than one type of DBMS.
However, this flexibility is gained at the cost of
duplication of data descriptions in the database and the
data dictionary.
Free-standing Data Dictionaries

    DATA CATALOGUE 2 (1974)
        - Synergetics Corporation
    DATA DESIGNER (1975)
        - Database Design, Inc.
    PRIDE-LOGIK (1974)
        - M. Bryce & Associates, Inc.
    DATAMANAGER (1975)
        - Management Systems & Programming, Ltd.

DBMS-Dependent Data Dictionaries

    ADABAS (1978)
        - Software AG of North America, Inc.
    DATA DICTIONARY/DATACOM (1979)
        - Applied Data Research (ADR)
    ORACLE (1979)
        - Relational Software, Inc.
    DB/DC DATA DICTIONARY (1974)
        - International Business Machines
    EDICT (1976)
        - Infodata Systems, Inc.


Figure 2.2 Free-standing and Dependent Data Dictionaries


A DBMS-dependent data dictionary (also called merged or
integrated) is a component of a specific database management
system; it uses the software facilities available within the
DBMS to manage the data in the database. This type of data
dictionary minimizes redundancy and limits the number of
possible errors because data descriptions exist in only one
place, in the data dictionary. It also benefits from the
sophisticated backup and recovery facilities of the DBMS.

A data dictionary is also described as having active or
passive interfaces or a combination of the two. An
interface is a series of commands which connect the data
dictionary with other software such as compilers, operating
systems, report generators, and other programs. The data
dictionary supports these applications by providing the
metadata that is required for their execution. An active
data dictionary is one in which information is created,
accessed, or modified through the data dictionary
interfaces. New or changed metadata is automatically updated and
stored in the data dictionary. This is not true of a
passive data dictionary: when new metadata is generated,
the data dictionary may or may not be automatically updated,
and when data is retrieved, it may be accessed through the
data dictionary or directly from the database.
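
The practical difference between the two kinds of interface
can be sketched as follows. This is a minimal illustration
under our own assumptions: the database is modeled as a
mapping from record names to sets of field names, and the
class and function names are hypothetical.

```python
class ActiveDictionary:
    """Every definition passes through the dictionary interface."""

    def __init__(self):
        self.metadata = {}

    def define_field(self, database, record, field, fmt):
        # The metadata and the database definition change together.
        self.metadata[(record, field)] = fmt
        database.setdefault(record, set()).add(field)


def undocumented_fields(metadata, database):
    """Fields that exist in the database but not in the dictionary."""
    return sorted(
        (record, field)
        for record, fields in database.items()
        for field in fields
        if (record, field) not in metadata
    )


# Active case: the field is defined through the dictionary.
active = ActiveDictionary()
active_db = {}
active.define_field(active_db, "Personnel", "Rank", "3 characters")

# Passive case: the database is changed directly and nobody
# updates the dictionary, so the two drift apart.
passive_metadata = {}
passive_db = {"Personnel": {"Rank"}}
```

Here `undocumented_fields(active.metadata, active_db)` is
empty, while the passive case reports the "Rank" field as
undocumented.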

There are many perspectives from which to look at the
data that resides in a database. There is the physical (or
internal) view that consists of the actual physical
representation, format, and location of the data as "seen" by
the computer. There is also a logical (or conceptual, or
global enterprise) view called a schema, which describes all
of the data in the database in its logical format, i.e.,
what types of records are to be maintained, the contents of
those records, and the relationships among those records.
This is the data as it would be presented to a human, not
its actual computer format. In most cases, only the database
administrator has access to the schema. Another view is the
external view, also called a subschema, which is a subset
of the logical view tailored to a particular user or
application. This is analogous to a "window" through which
only a portion of the total data is seen. Subschemas can be
utilized to implement security by restricting a user's
access to data.

Figure 2.3 shows the three different perspectives of
data in a sample database of students at the Naval
Postgraduate School. (A) is the computer's physical view
and thus is not visible to the human user. (B) shows the
overall logical view of this small database. (C) is a
subset of (B) as it would be seen by a user who is
interested in only a portion of the database--in this case,
the senior Army officer who wants information only on Army
students.
(A) Physical View as "seen" within the Computer

(B) Logical View of the Data

    NAME                  SSN           SERVICE   RANK
    MARKEY, Ronald P.     452-43-6028   USA       O-5
    JOHNSON, Bruce M.     348-57-8826   USN       O-4
    BROWN, Jennifer C.    512-47-2228   USNR
    DAVIS, Thomas E.      662-76-8239   USAF      O-3
    MASON, Robert J.      823-48-3991   USA       O-3
    GEIB, Thomas W.       773-48-8725   USN       O-4
    LANE, Donna S.        371-67-7476   USNR      O-3
    WILLIAMS, Guy T.      547-23-3410   USA       O-3

(C) One External View of the Data
    (subset of the logical view)

    NAME                  SSN           RANK
    MARKEY, Ronald P.     452-43-6028   O-5
    MASON, Robert J.      823-48-3991   O-3
    WILLIAMS, Guy T.      547-23-3410   O-3


Figure 2.3 Views Within a DBMS
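
The derivation of an external view from the logical view can
be sketched as a selection of rows followed by a selection of
fields. The function name and the handful of sample rows are
illustrative only; they are loosely modeled on the student
database and are not a complete copy of the figure.

```python
# A few illustrative rows of the logical view (the schema).
LOGICAL_VIEW = [
    {"name": "JOHNSON, Bruce M.", "ssn": "348-57-8826", "service": "USN", "rank": "O-4"},
    {"name": "DAVIS, Thomas E.", "ssn": "662-76-8239", "service": "USAF", "rank": "O-3"},
    {"name": "MASON, Robert J.", "ssn": "823-48-3991", "service": "USA", "rank": "O-3"},
    {"name": "WILLIAMS, Guy T.", "ssn": "547-23-3410", "service": "USA", "rank": "O-3"},
]


def external_view(records, service, fields):
    """Return a subschema: only one service's rows, only the named fields."""
    return [
        {field: record[field] for field in fields}
        for record in records
        if record["service"] == service
    ]


# The "window" seen by a user interested only in Army students.
army_view = external_view(LOGICAL_VIEW, "USA", ("name", "ssn", "rank"))
```

Because the user only ever receives `army_view`, the rows for
other services and the SERVICE field itself stay out of reach,
which is how a subschema can double as a security measure.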


B. THE STRUCTURE OF A DATA DICTIONARY 


There are three kinds of elements upon which the
structure, or schema, of a data dictionary is built: entities,
attributes, and relationships. The basic element of the
dictionary is the entity. Each entity has a unique name and
represents an object in the real world, such as a person,
thing, or idea about which information is recorded. For
example, in our Naval Postgraduate School database we
collected information about students. We also described the
students by name, Social Security number, service, and rank.
These characteristics of an entity are called attributes,
and can be either quantitative or qualitative.

A relationship is a logical link between two entities
that can also be described by attributes. A relationship
will fall into one of three categories of mappings:
one-to-one, one-to-many/many-to-one, or many-to-many. A
one-to-one relationship exists when each entity or attribute is
logically linked to one and only one other entity or
attribute. For instance, we say that there is a one-to-one
relationship between an individual's social security number
and his name. In a one-to-many/many-to-one relationship,
each entity or attribute is logically linked to one or more
other entities or attributes. An example of this is the
relationship between the instructor of a class and the
students in that class. A many-to-many relationship occurs
when one or more entities or attributes are related to one
or more other entities or attributes. For example, there is
a many-to-many relationship between the attributes "color"
and "model" of a type of car--each color may be available on
many different car models and each car model may be
available in many different colors.
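
The three categories of mappings can be illustrated by
classifying a set of observed links. This is a sketch only:
the function name is our own, and it judges the mapping from
the sample pairs it is given rather than from any rule stored
in a dictionary.

```python
from collections import defaultdict


def classify_mapping(pairs):
    """Classify (left, right) links as one of the three mapping categories."""
    rights_of = defaultdict(set)
    lefts_of = defaultdict(set)
    for left, right in pairs:
        rights_of[left].add(right)
        lefts_of[right].add(left)
    left_has_many = any(len(r) > 1 for r in rights_of.values())
    right_has_many = any(len(l) > 1 for l in lefts_of.values())
    if left_has_many and right_has_many:
        return "many-to-many"
    if left_has_many or right_has_many:
        return "one-to-many/many-to-one"
    return "one-to-one"


ssn_to_name = [("348-57-8826", "JOHNSON, Bruce M."),
               ("662-76-8239", "DAVIS, Thomas E.")]
instructor_to_student = [("instructor", "student A"),
                         ("instructor", "student B")]
color_to_model = [("red", "sedan"), ("red", "coupe"), ("blue", "sedan")]
```

Applied to the three examples in the text, the function yields
one-to-one, one-to-many/many-to-one, and many-to-many
respectively.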

In order to understand the generic terms we have
presented in their proper context, it is important to
differentiate between the dictionary schema itself, the
metadatabase that it governs, and the "real" data in the
organization's database. These concepts are made even more
confusing because the terminology used to refer to these
three levels of data differs from vendor to vendor and from
author to author. We will look at these levels using the
terminology of Applied Data Research, Inc.'s DATADICTIONARY
[Ref. 4] because it provides the clearest distinction
between the three. (DATADICTIONARY will be discussed in
depth in Chapter V.)

At the highest level of abstraction, entities,
attributes, and relationships are grouped by type:

    the dictionary schema can then be thought of as
    containing all existing entity-types,
    relationship-types and attribute-types, any one of which
    will also be referred to as a schema descriptor [Ref. 5].

The schema descriptors are the general categories of data
that are stored in the metadatabase. Figure 2.4 shows
examples of some standard schema descriptors.


    Entity-types    Attribute-types     Relationship-types

    File            Author              Contains
    Record          Description         Owns
    Field           Password            Processes
    Module          Status              Derived From
    Program         Version             Resides
    Report          Frequency           Uses
    Job             Security Class      Includes
    Dataview        Alias               Authority
    User            Comment             Accesses
    System          Effective Date
    Process         Usage Statistics


Figure 2.4 Sample Schema Descriptors


At the metadatabase level, we look at specific instances
of schema descriptors. Thus, we define an entity-occurrence
as a specific instance of the general category entity-type.
If PROGRAM is the entity-type, ACCOUNTS RECEIVABLE could be
one entity-occurrence. Similarly, a relationship-occurrence
is a specific instance of the general category
relationship-type. The relationship-type ACCESS may have as a
relationship-occurrence PROGRAM-ACCESSES-FILE. At this
level, we also talk about the specific characteristics of an
attribute-type. An attribute-type is the name of a
characteristic of an entity-occurrence, as Social Security
Number characterizes a student. An attribute-characteristic
is not the value of the attribute-type, but the parameters
of an attribute-type, such as its length and format. For
example, the attribute-type Social Security Number will be
characterized as eleven characters long, of the form
999-99-9999. Entity-occurrences, relationship-occurrences,
and attribute-characteristics will be referred to as the
descriptors of the metadatabase.

At the "real" data level of the organization's database,
we think in terms of actual values of data, such as
"Jennifer C. Brown", "547-23-3410", "left-handed monkey
wrench", or "93903". These are all values of the
attributes of an entity, and are called attribute-values.

An example of each of the levels of data is given in
Figure 2.5. We will use the generic terms entity,
attribute, and relationship in this thesis where it is not
necessary to distinguish between the three levels.

When a data dictionary is received from the vendor, it
contains a system standard schema which includes certain
basic entity-types, attribute-types, and relationship-types
chosen by the vendor. A data dictionary is extensible if an
organization is able to customize the schema by defining its
own entity-types, attribute-types, and relationship-types in
addition to those included in the system standard schema.
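
Extensibility can be pictured as a schema that keeps the
vendor's entity-types fixed while letting the organization add
its own. The class below is a hypothetical sketch, not the
interface of any actual product, and the entity-type SHIP is
an invented example.

```python
class DictionarySchema:
    """A system standard schema plus organization-defined entity-types."""

    def __init__(self, standard_entity_types):
        # Entity-types chosen by the vendor; these cannot be redefined.
        self.standard = frozenset(standard_entity_types)
        # Entity-types added by the organization.
        self.custom = set()

    def add_entity_type(self, name):
        if name in self.standard:
            raise ValueError(name + " is already in the standard schema")
        self.custom.add(name)

    def entity_types(self):
        return self.standard | self.custom


schema = DictionarySchema({"FILE", "RECORD", "PROGRAM"})
schema.add_entity_type("SHIP")  # hypothetical customization
```

A dictionary that offered no equivalent of `add_entity_type`
would not be extensible in the sense used here.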
    Schema Descriptors            Example

      Entity-type                 Record
      Attribute-type              Name
      Relationship-type           Contains

    Metadatabase Descriptors      Example

      Entity-occurrence           Student
      Attribute-characteristic    alphanumeric
      Relationship-occurrence     Student-Contains-Name

    Database Data                 Example

      Attribute-value             Ronald P. Markey


Figure 2.5 Comparison of Data Levels
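
The three levels can also be seen in a small sketch for the
Social Security Number example. The class name is our own, and
the regular expression is simply one way, assumed here, of
writing down an eleven-character 999-99-9999 format.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class AttributeCharacteristic:
    """Metadatabase level: parameters of an attribute-type, not its value."""
    attribute_type: str  # schema level: the name of the attribute-type
    length: int
    pattern: str

    def accepts(self, value):
        """Check a database-level attribute-value against the parameters."""
        return (len(value) == self.length
                and re.fullmatch(self.pattern, value) is not None)


ssn = AttributeCharacteristic(
    attribute_type="Social Security Number",
    length=11,
    pattern=r"\d{3}-\d{2}-\d{4}",
)
```

The string "547-23-3410" is an attribute-value that the
characteristic accepts; "93903" is a perfectly good
attribute-value for some other attribute-type, but not for
this one.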


C. THE FUNCTIONS OF A DATA DICTIONARY


The functions performed by a typical data dictionary
fall into four categories: definition, update, retrieval,
and software interface. A data dictionary should be
evaluated in each category according to the ease and success
with which the functions are performed.


1. Definition 


The first step in the implementation of a data
dictionary is to collect information about some portion of
an organization's data, such as the U.S.S. Constellation's
supply department. This is done by interviewing supply
department personnel, identifying the data received and
produced by the department, and analyzing the software that
manipulates that data. Once entities, attributes, and
relationships have been defined, these data elements are
entered into the data dictionary using the dictionary's data
definition commands. The elements are classified according
to the entity-types, attribute-types, and relationship-types
of the system standard schema, or the dictionary
administrator may use customized data types as necessary,
assuming the dictionary is extensible.


2. Update


As an organization evolves, so does its data. One of
the functions of the data dictionary is to allow the
addition, modification, and deletion of elements. For
instance, a new Navy regulation might require the supply
department to keep track of certain data about a new
inventory item and to report this data quarterly. Or perhaps
the administrative department will have to change zip codes
to the new nine-digit format on all correspondence. Each of
these changes will be introduced via modifications to the
dictionary schema.


3. Retrieval


Information can be retrieved from a data dictionary
by using query language commands or the report-generating
capability of the dictionary. A dictionary will provide
structured commands or an English-like query language that
will help the supply department to find out the Navy part
number for a monkey wrench. It will also allow the
dictionary administrator to find out which users have access
to a particular subschema. Reports are produced by a data
dictionary according to a vendor-defined format or to user
specifications. Reports generally produce a larger volume
of output than queries and are often printed out in hard
copy.
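
A query such as "How many programs use the Personnel file?"
amounts to a search over relationship-occurrences in the
metadatabase. The sketch below assumes that these are held as
simple (subject, relationship, object) triples; the program
and user names are invented for illustration.

```python
# Hypothetical relationship-occurrences held in the metadatabase.
RELATIONSHIPS = [
    ("PAYROLL", "USES", "Personnel"),
    ("MUSTER REPORT", "USES", "Personnel"),
    ("REORDER", "USES", "Inventory"),
    ("SUPPLY OFFICER", "ACCESSES", "Supply Subschema"),
]


def subjects_of(relationships, relationship, target):
    """Return every subject linked to the target by the given relationship."""
    return sorted(
        subject
        for subject, rel, obj in relationships
        if rel == relationship and obj == target
    )


programs_using_personnel = subjects_of(RELATIONSHIPS, "USES", "Personnel")
users_of_subschema = subjects_of(RELATIONSHIPS, "ACCESSES", "Supply Subschema")
```

A report differs from such a query mainly in volume: it would
list the result for every file or every subschema rather than
for a single one.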


4. Software Interface


The software interface function provides a means of
access to the data dictionary for applications software,
including compilers, editors, and database management
systems. A COPY command is used to bring data descriptions
(e.g., of records or files) directly into the programs being
developed from the data dictionary. Thus, the job of the
programmer is made easier and data use is standardized. It
is also possible for applications software to directly
retrieve and make changes to the elements in a data
dictionary.
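
The effect of a COPY-style command can be sketched as a
program obtaining its record layout from the dictionary
instead of declaring it by hand. The function names, the
STUDENT record, and the field lengths below are all
assumptions made for the example.

```python
# Hypothetical record descriptions held in the data dictionary:
# each field is stored with its length in characters.
RECORD_DESCRIPTIONS = {
    "STUDENT": [("name", 25), ("ssn", 11), ("service", 4)],
}


def copy_layout(record_name):
    """Return (field, start, end) positions for a fixed-width record."""
    layout, start = [], 0
    for field, length in RECORD_DESCRIPTIONS[record_name]:
        layout.append((field, start, start + length))
        start += length
    return layout


def parse(line, layout):
    """Split one fixed-width line into named fields using the copied layout."""
    return {field: line[start:end].strip() for field, start, end in layout}


line = "WILLIAMS, Guy T.".ljust(25) + "547-23-3410" + "USA".ljust(4)
student = parse(line, copy_layout("STUDENT"))
```

Every program that copies the layout this way reads the record
identically, so a change to a field length is made once, in
the dictionary.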
III. FEDERAL INFORMATION PROCESSING STANDARD FOR DATA
DICTIONARY SYSTEMS


A. INTRODUCTION 


The Institute for Computer Sciences and Technology of
the National Bureau of Standards is in the process of
developing a standard software specification for data
dictionaries. The Federal Information Processing Standard
for Data Dictionary Systems (FIPS DDS) is intended to serve
as a guideline for the evaluation and selection of data
dictionaries to be used by the federal government. The four
volumes¹ "specify and describe the functionality, database
structure, and user interfaces of the FIPS DDS" [Ref. 6].

We examined three volumes of the FIPS DDS: Command
Language Interface Specifications (volume 2), Interactive
Interface Descriptions (volume 3), and Dictionary
Administrator Support Specifications (volume 4). The
subject of each of the volumes corresponds to one of the
three categories of users who will interact with a data
dictionary--the experienced user, the relatively
inexperienced user, and the administrator of the data
dictionary.

The FIPS DDS describes in detail a suggested system
standard schema for a data dictionary, including definitions
and use of the schema descriptors. Each of the volumes
presents the syntax for commands necessary for its target
users to manipulate the dictionary. In addition, the
results of each command are detailed, with error messages
and "successful completion" messages listed where
applicable.

¹Note: Volume 1 is not yet available for review. The
FIPS DDS is in draft form and has not been formally approved
by the National Bureau of Standards.
B. SYSTEM STANDARD SCHEMA


The system standard schema set forth in the FIPS DDS
provides basic entity-types, attribute-types, and
relationship-types as follows:

Entity-types

1.  SYSTEM--a collection of processes and data

2.  PROGRAM--an automated process

3.  MODULE--an automated process which is a logical
    subdivision of a PROGRAM or an independent process
    called by a PROGRAM

4.  FILE--an organization's data collection

5.  RECORD--logically associated data which belongs to
    the organization

6.  DOCUMENT--human-readable data collections

7.  ELEMENT--data belonging to the organization

8.  USER--members or collections of members belonging to
    the organization using the facilities available in
    the data dictionary

9.  DICTIONARY-USER--users of the dictionary system
    itself

10. ACCESS-CONTROLLER--specifies access restrictions to
    an entity or set of entities in the dictionary

SYSTEM, PROGRAM, and MODULE are of the class "Process";
FILE, RECORD, DOCUMENT, and ELEMENT are of the class "Data";
USER is classed as "External"; and DICTIONARY-USER and
ACCESS-CONTROLLER belong to a fourth class.

There are 55 attribute-types included in the system
standard schema, similar to the ones shown in Figure 2.4.


Relationship-types

The standard relationship-types provided by FIPS are as
follows:

1.  CONTAINS--describes entities composed conceptually
    of other entities

2.  PROCESSES--shows the relationship between a process
    and data

3.  RESPONSIBLE-FOR--shows the association between
    entities representing organizational components and
    entities denoting organizational responsibility

4.  RUNS--shows the relationship between a user and a
    process

5.  TO--shows the flow between two processes

6.  DERIVED-FROM--shows that an entity is the result of
    some operation on another entity

The FIPS DDS includes an extensibility facility to
provide for the customization of the system standard
schema to match the organization's needs.
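
One way to picture how the entity classes constrain the
relationship-types is the sketch below. It covers only the
eight entity-types whose class is named above, and the
checking rule and the occurrence naming pattern are our own
assumptions; neither is quoted from the FIPS DDS volumes.

```python
# Entity-types and their classes as listed in this section.
ENTITY_CLASSES = {
    "SYSTEM": "Process", "PROGRAM": "Process", "MODULE": "Process",
    "FILE": "Data", "RECORD": "Data", "DOCUMENT": "Data", "ELEMENT": "Data",
    "USER": "External",
}


def processes(process_type, data_type):
    """Name a PROCESSES relationship-occurrence between a process and data.

    Assumed rule: the first entity-type must be of class "Process"
    and the second of class "Data".
    """
    if ENTITY_CLASSES.get(process_type) != "Process":
        raise ValueError(process_type + " is not a process entity-type")
    if ENTITY_CLASSES.get(data_type) != "Data":
        raise ValueError(data_type + " is not a data entity-type")
    return process_type + "-PROCESSES-" + data_type


occurrence = processes("PROGRAM", "FILE")
```

Under this assumed rule, `processes("FILE", "PROGRAM")` is
rejected because a file is not a process.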


C. COMMAND LANGUAGE INTERFACE SPECIFICATIONS


The experienced user is one who is familiar with the
structure and commands of the data dictionary and who needs
access to the full functionality of the data dictionary.
Command language commands are used to facilitate this access
by allowing the user to:

--define data elements

--maintain the dictionary (add/modify/delete)

--report on dictionary elements

--query the dictionary about data elements

--build entity lists and perform operations on groupings
of entities that meet certain criteria (useful for global,
vice individual, operations)

--support applications programs that interact with the
data dictionary

--perform general utilities, such as changing the mode
of operation and obtaining help information.
The syntax of each of the command language commands is
presented in the FIPS DDS using Backus-Naur form.² For
example, the following command would be used to modify an
entity that already exists in the dictionary:

    MODIFY-ENTITY
        [WHERE] NAME [IS] <entity-name>
        [ADD NEW-VERSION [<version-number>]]
        WHERE ATTRIBUTES [ARE] <attribute-clause-1>
        [, ... , <attribute-clause-n>]

where:

--entity-name refers to a single entity in the
dictionary

--NEW-VERSION is an optional clause which results in the
creation of a new entity which has a primary-name consisting
of the assigned-name of the entity-name specified and the
next-highest version-number

--attribute-clause-n refers to a clause used to
designate the attributes of the specified entity which are
to be modified
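
To show how the optional parts of this syntax fit together,
the sketch below assembles a command string from its pieces.
It reads the command name as MODIFY-ENTITY, writes out the
optional keywords WHERE, IS, and ARE, and treats each
attribute clause as an opaque string, since the internal
syntax of an attribute clause is not given here. The function
is hypothetical and is not part of the FIPS DDS.

```python
def modify_entity_command(entity_name, attribute_clauses,
                          new_version=False, version_number=None):
    """Assemble a MODIFY-ENTITY command in the form shown above."""
    if not attribute_clauses:
        raise ValueError("at least one attribute clause is required")
    if version_number is not None and not new_version:
        raise ValueError("a version number requires ADD NEW-VERSION")
    parts = ["MODIFY-ENTITY", "WHERE NAME IS", entity_name]
    if new_version:
        parts.append("ADD NEW-VERSION")
        if version_number is not None:
            parts.append(str(version_number))
    parts.append("WHERE ATTRIBUTES ARE")
    parts.append(", ".join(attribute_clauses))
    return " ".join(parts)


command = modify_entity_command(
    "PAYROLL", ["<attribute-clause-1>", "<attribute-clause-2>"],
    new_version=True, version_number=3,
)
```

Omitting `new_version` gives the plain form of the command,
which modifies the named entity without creating a new
version.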


²Backus-Naur form is explained in Appendix A.


D. INTERACTIVE INTERFACE SPECIFICATIONS


The interactive interface for the relatively
inexperienced user is designed to lead the user step-by-step
through the desired operations. Without having to master the
command language commands, the interactive interface user
has a large subset of the total functionality available
within the data dictionary, including manipulation,
reporting, querying, and entity list operations. The FIPS
DDS recommends that this interface be implemented by means
of "panels" (screens) that are presented to the user in
sequence and which contain the following information areas:

1.  state area--tells the user where (in which
    dictionary) he is and what he is doing

2.  data area--for entering and displaying data

3.  schema area--used mostly for dictionary updates to
    show available options and limitations on actions

4.  message area--for error messages and warnings

5.  action area--tells the user how to proceed from the
    current panel

6.  help area--for the display of help information
    requested by the user

The user begins his session with the data dictionary at
a "home panel" which provides entry into the system. At any
point along the way he has the option of saving or undoing
any panel with which he has been working. This panel-driven
interface ensures that the user always knows where he is in
the dictionary, what mistakes he has made, what choices he
has to continue, and what help is available to him.


E. DICTIONARY ADMINISTRATOR SUPPORT SPECIFICATIONS 


The administrator of the data dictionary, of course, has
access to both the standard command language and the
interactive interface. His or her main concern, however, is
the management of the schema. This is accomplished by means
of a specialized set of commands for

--extending the system standard schema

--reporting on the schema

--implementing access control measures

--controlling export from and import to the dictionary.

We have already defined the extensibility facility as
the ability to add schema descriptors to the system standard
schema. The report facility allows the administrator to
generate a listing of the entire schema or any subset
thereof. The security facility provides commands for
restricting the access of users to the dictionary by
specifying which commands the user is allowed to execute.
The export/import facility allows transfer of parts of one
dictionary to another, but only between dictionaries whose
schemas are identical, in order to preserve the integrity of
the "target" dictionary.
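
The restriction on export/import can be sketched as a guard
that compares the two schemas before anything is copied. Each
dictionary is modeled here as a plain mapping with a "schema"
and an "entities" part; that layout, the function name, and
the sample entries are assumptions made for illustration.

```python
def import_entities(target, source, names):
    """Copy named entities from source to target if the schemas match."""
    if target["schema"] != source["schema"]:
        raise ValueError("schemas differ; import refused")
    for name in names:
        # Copy the description so later edits to one dictionary
        # do not silently change the other.
        target["entities"][name] = dict(source["entities"][name])
    return target


supply = {
    "schema": {"FILE", "RECORD", "PROGRAM"},
    "entities": {"Inventory": {"type": "FILE"}},
}
admin = {
    "schema": {"FILE", "RECORD", "PROGRAM"},
    "entities": {},
}
extended = {
    "schema": {"FILE", "RECORD", "PROGRAM", "SHIP"},
    "entities": {},
}

import_entities(admin, supply, ["Inventory"])
```

The transfer into `admin` succeeds because the schemas are
identical; the same call with `extended` as the target would
be refused.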


F. EVALUATION 


It is certainly true that the FIPS DDS presents the
reader with very detailed specifications of the commands and
facilities for a standardized data dictionary; the volumes
we reviewed could serve as the basis for an initial design
specification for the development of data dictionary
software. A dictionary based on the FIPS specifications would
perform the required functions discussed in Chapter II and
would contribute to the organization's management of its
data. The military and the federal government would benefit
greatly from the availability of standard software to
achieve control over their data resources.

The major contribution of the FIPS DDS is its
orientation to the needs of the different kinds of users of a
data dictionary. This is particularly evident in the
interface that is suggested for use by inexperienced users of
the data dictionary. The panel-driven format with its six
information areas is far less intimidating than the syntax
required by the command language. Even so, the interactive
interface still requires a certain degree of sophistication
on the part of the "inexperienced" user if he is to be able
to manipulate the dictionary. Another strong point of the
FIPS DDS is its consistency of presentation and format. No
matter what the operation, the procedures needed to
manipulate the dictionary and the manner in which the
dictionary "responds" to the user are logical and
predictable. The commands, however, are complex and require
knowledge of Backus-Naur form.

Even though the FIPS DDS does indeed provide a
comprehensive software standard for the computer
professional, we do not believe that it achieves its goal of
providing a guide for the evaluation and selection of data
dictionaries. Although the addition of the introductory
volume may help remedy the problem, the three volumes of
specifications ignore the forest of reasons behind the
implementation of a data dictionary while concentrating
solely on the patterns of the leaves on each tree. The FIPS
DDS will not be very useful to the individual searching for
basic assistance in evaluating commercial data dictionary
packages. Many of the books and articles we have reviewed
provide better explanations of data dictionary features and
comprehensive evaluation criteria.

We found that the terminology that the FIPS DDS uses for
the dictionary schema and the metadatabase is not explained
clearly, nor is it any less confusing than that of any other
publication. In addition, no specific examples of how an
organization's data would be entered in the data dictionary
are given. We feel that it is more important for the
potential data dictionary user to understand how a data
dictionary will assist in the management of data than to see
samples of every conceivable type of error message that
could occur. A summary of recommended features such as the
one we have just presented and a list of criteria for
evaluation would be far easier for the reader to digest.

None of the data dictionary packages we have reviewed does
things totally the "FIPS way", and it is unlikely that any
commercial dictionary vendor will ever conform exactly to FIPS
DDS guidelines. However, it is likely that the federal
government will insist that FIPS standards be incorporated into
future dictionaries intended for government use. In the next
chapter we will develop a set of criteria for an "ideal" data
dictionary, taking these recommendations into account. In
Chapter V we will examine four commercial data dictionary
packages and evaluate their success in meeting the ideal
criteria.


IV. THE ROLE OF THE DATA DICTIONARY IN INFORMATION RESOURCE MANAGEMENT


In this chapter we will see how a data dictionary can
contribute to the goal of efficient management of an
organization's data. We will first discuss the process of
development of an information system in an organization and
then will discuss the three objectives of data dictionaries
that we have identified as contributing the most to the
accomplishment of this goal: data security, data integrity, and
documentation/maintenance. We will then develop a set of
criteria for the "ideal" data dictionary to be used in the
evaluation of data dictionary packages.


A. INFORMATION RESOURCE MANAGEMENT 


Organizations today have become increasingly aware of the need
to manage data just as they manage other essential resources.
If properly managed, the necessary data will be available,
up-to-date, and retrievable when required to provide
information that is of value to the organization. This concept
is known as Information Resource Management (IRM), although it
might also be referred to as Data Resource Management.

IRM has been the focus of a great deal of interest in recent
years. In October of 1980, the Institute for Computer Sciences
and Technology of the National Bureau of Standards (NBS) and
the Association for Computing Machinery (ACM) co-sponsored a
workshop on IRM strategies and tools. It was based on the
premise that


IRM is currently one of the most significant topics being
discussed concerning information systems, and is being
discussed along a variety of lines of thought. These include
business systems planning; information systems analysis,
design, and development; database design and implementation;
the disciplines of office management, paperwork management,
and information sciences management; and the various problems
and costs associated with implementing IRM to include each of
these areas. [Ref. 7]


The Proceedings of the workshop defined IRM as


whatever policy, action, or procedure concerning information
(both automated and non-automated) which management
establishes to serve the overall current and future needs of
the enterprise. Such policies, etc., would include
considerations of availability, timeliness, accuracy,
integrity, privacy, security, auditability, ownership, use,
and cost effectiveness. [Ref. 8]


The recommendations of the NBS/ACM workshop on the role that
the data dictionary should play in IRM were incorporated into
the Federal Information Processing Standard for Data Dictionary
Systems (FIPS DDS).

In order to understand how the data dictionary contributes to
the production of valuable information for an organization, we
will look more closely at the organization itself and at its
functions. An organization is made up of many systems that
convert resources into usable output. An information system,
then, is one that takes raw data and transforms it into
information that can be used by the organization. If the
process by which the organization develops its information
systems is the heart of information resource management, then
it is the data dictionary that keeps it ticking.

Assume that the U.S.S. Constellation has identified a problem
with the way a particular information system is currently
operating. It could be preventive maintenance record-keeping,
the supply department inventory, the personnel administration
system, or a system that affects the entire organization. The
process of analyzing the system and developing a system to
solve this problem evolves through four distinct phases, called
the System Development Life Cycle (SDLC). We will show how the
data dictionary supports the SDLC, and thus IRM, through
planning, study, design/coding, and operation and maintenance.
We have based our analysis of the SDLC on that of Leong-Hong
and Plagman [Ref. 9].


1. Planning Phase 


The Proceedings of the NBS/ACM workshop emphasized the need for
a "top-down" approach to IRM in an organization. During the
planning phase, the organization's long-range plans, its
functions, and structure are analyzed to ensure that any
information system that is developed will complement those
needs.

If a data dictionary is already in existence, it can provide
information about the functions of the organization that have
been defined, or it can document the initial definition of
those functions. For each function, it must be determined who
does it, what is produced, what other functions it interacts
with, and what inputs are needed to accomplish the function. As
an example, we can say of the Payroll function that it is
performed by the disbursing office, paychecks and leave and
earnings statements are produced, it interacts with the
personnel administration system, and it requires data about all
members of the crew, including rank/rate, time in service, and
so on.
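
As a rough illustration only, the four questions asked of each
function could be captured in a simple record. The sketch below
is written in Python, which is our choice and not something any
dictionary package prescribes; the class name FunctionEntry, its
field names, and the helper missing_answers are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FunctionEntry:
    """One organizational function as it might be recorded during planning."""
    name: str
    performed_by: str                                    # who does it
    outputs: list = field(default_factory=list)          # what is produced
    interacts_with: list = field(default_factory=list)   # other functions
    inputs: list = field(default_factory=list)           # data needed


# The Payroll example from the text, expressed as one entry.
payroll = FunctionEntry(
    name="Payroll",
    performed_by="Disbursing Office",
    outputs=["paychecks", "leave and earnings statements"],
    interacts_with=["Personnel Administration"],
    inputs=["rank/rate", "time in service"],
)


def missing_answers(entry):
    """Return the names of the planning questions that are still unanswered."""
    return [key for key, value in vars(entry).items() if not value]
```

An entry with every question answered yields an empty list, so
the same helper could be used to find functions whose definition
is still incomplete.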

At this stage of the development process, the "big picture" is
drawn while the details are left until later. Thus, general
categories of data such as "accounting data" and "personnel
data" and the transactions that affect them are defined and
entered in the dictionary.

In the aggregate, this planning information constitutes a
conceptual data model. "Definition and analysis of subsequent
information requirements (and eventually, database design) will
be dependent upon this data model" [Ref. 10]. The fact that the
development of this model has been automated, rather than
manual, ensures a quicker, standardized process.


2. Study Phase 


At this point in the SDLC, a greater level of detail is
introduced. The data dictionary provides a standardized source
of information about the inputs and outputs of the
organization's functions. Specific entities, attributes, and
relationships are chosen from the general categories of data
identified in the planning phase. The entity PART in the
Constellation's inventory system may be described by the
attributes Navy Part Number, Description, Storage Location, and
Quantity. There may also be a many-to-many relationship
assigned between PART and DEPARTMENT. Reports required to be
produced are also defined and the necessary input data is
identified.
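
A minimal sketch of how the PART example might be held in a
dictionary follows. The in-memory layout (a mapping of entities
to attributes plus a list of relationship triples) and the
function related_to are assumptions made for illustration; no
attributes are given in the text for DEPARTMENT, so none are
listed.

```python
# Hypothetical in-memory layout for entity and relationship entries.
entities = {
    "PART": ["Navy Part Number", "Description", "Storage Location", "Quantity"],
    "DEPARTMENT": [],  # attributes are not given in the text
}

# Each relationship is recorded once as (entity, entity, cardinality).
relationships = [("PART", "DEPARTMENT", "many-to-many")]


def related_to(entity):
    """List (other entity, cardinality) for every relationship touching entity."""
    pairs = []
    for left, right, cardinality in relationships:
        if entity == left:
            pairs.append((right, cardinality))
        elif entity == right:
            pairs.append((left, cardinality))
    return pairs
```

Because each relationship is stored once and read from either
side, asking about PART or about DEPARTMENT returns the same
many-to-many link.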

This information provides what is called a detailed logical
model, an expansion of the conceptual model of the planning
phase. The data dictionary can be used to identify redundancy
within the data model by determining whether the data entered
already exists. In addition, with the aid of the dictionary,
the systems analyst will be


able to determine what data is [...], how it is being used,
how it can be accessed, who has primary responsibility for
its definition and [...], and most important, whether there
is conflict in using the data, that is, what impact it has
on other application systems [Ref. 11].


3. Design/Coding Phase


The purpose of the design phase is to provide specifications
for programming and implementing the system. It is here that
the data dictionary's schema descriptors will be used or
expanded to meet the needs of the system. If a database does
not already exist, and it is determined that one is required,
the data dictionary schema will provide a basis from which to
implement one. Data integrity is enforced because the
dictionary serves as the sole source of data definition and
structure.

When software is being coded, the data dictionary provides
documentation for the programmer and a COPY facility for
transporting record definitions, for example, into the program
being developed. An important element of the dictionary is the
constraints that are defined for data values. In this way, data
that is input to a program can be checked against the
constraints that have been established. Documentation of the
program includes the author, a description, input requirements,
output produced, and information on what other programs are
called upon, all of which are incorporated into the data
dictionary.
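
The idea of checking program input against constraints held in
the dictionary can be sketched as follows. This is an
illustration under stated assumptions, not the mechanism of any
particular package: the constraint layout, the numeric range for
Quantity, and the allowed values for Readiness Status are all
invented for the example.

```python
# Hypothetical constraint entries, keyed by data item name.
constraints = {
    "Quantity": {"type": int, "min": 0, "max": 9999},                       # assumed range
    "Readiness Status": {"type": str, "allowed": {"C1", "C2", "C3", "C4"}},  # assumed values
}


def violations(item_name, value):
    """Return a list of the constraints that value breaks (empty if none)."""
    rule = constraints[item_name]
    if not isinstance(value, rule["type"]):
        return ["wrong type"]  # further checks are meaningless for the wrong type
    problems = []
    if "min" in rule and value < rule["min"]:
        problems.append("below minimum")
    if "max" in rule and value > rule["max"]:
        problems.append("above maximum")
    if "allowed" in rule and value not in rule["allowed"]:
        problems.append("value not allowed")
    return problems
```

Keeping the rules in one table means every program that accepts
Quantity applies the same limits, which is the point of making
the dictionary the sole source of data definition.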


4. Operation and Maintenance 


After a new system has been implemented, the work of the data
dictionary does not end. All of the documentation that has been
recorded during the development of the system serves as a base
of reference for the users of the system. In addition to the
database administrator and the administrator of the dictionary,
the key players in information resource management who benefit
from the use of a data dictionary fall into six groups,
according to Allen, Loomis, and Mannino [Ref. 12]:

1. The data administrator, who is responsible for the
   overall administration of the data resource, uses
   the dictionary as a tool to enforce the way data is
   stored, maintained, and monitored.

2. Data processing managers benefit from the dictionary's
   reports on data usages.

3. Operations personnel retrieve information from the
   dictionary about jobs that are being run.

4. Programmers and analysts use the dictionary to
   retrieve data definitions and to document a system
   being developed.

5. End users access the data dictionary for descriptions
   of their dataviews.

6. Finally, auditors will use the documentation
   provided by the data dictionary to trace data and
   programs as they are used in the computer system.

It is the process of implementing a data dictionary that we
have just described (the analysis of the organization, the
definition of its functions, and the documentation of its
information systems) that makes the dictionary so important in
information resource management. We have seen that during the
development of an information system, the data dictionary is
involved from the initial planning stage, through the
programming process, through the operation, and into the
maintenance of the system. The dictionary provides the
standards for data which will be used throughout the life of
the system and referenced when developing other systems. Key
contributions include decreasing the amount of redundancy of
data required to be stored, enforcing security of the valuable
data resource through access controls and implementation of
user views, and providing documentation which serves as a
"corporate history" and as a reference upon which maintenance
and auditing are based. These objectives of data dictionary
usage are discussed in detail in the next section.


B. OBJECTIVES OF A DATA DICTIONARY


In this section we will focus on the three major contributions
of the data dictionary to the management of an organization's
data. These are data security, data integrity, and
documentation/accountability. Although we recognize that other
objectives of data dictionary usage might be identified, we
believe that each will fall into one of these three major
groupings.


1. Data Security


There are two distinct levels of security of the data in an
organization's database which will be provided either by the
data dictionary or by the database management system itself.
First, procedures should exist to ensure that only authorized
personnel are allowed to access the information contained
within the database. The widespread use of computers and the
increasing sophistication of users have made an organization's
data vulnerable to embezzlers, amateur "hackers", corporate
spies, and careless employees. Second, the system should
contain provisions for controlling the amount and types of data
that each authorized user is allowed to access within the
system. Some of the sophisticated data dictionaries, for
example, include a trace mechanism which increases security by
recording every inquiry that is made into system files and
data. If an intrusion is made into the system by unauthorized
personnel, the specifics of that inquiry, including the data
which was accessed, will be recorded.
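
A trace mechanism of the kind described can be sketched in a few
lines. The names trace_log, authorized, and inquire are
hypothetical, as are the user names; the only point illustrated
is that every inquiry is recorded, whether or not it is
permitted.

```python
from datetime import datetime, timezone

trace_log = []            # every inquiry is appended here
authorized = {"jones"}    # assumed set of authorized user names


def inquire(user, item):
    """Record the inquiry, then report whether the user was authorized."""
    entry = {
        "when": datetime.now(timezone.utc),
        "user": user,
        "item": item,
        "authorized": user in authorized,
    }
    trace_log.append(entry)  # logged before any decision is returned
    return entry["authorized"]
```

Because the entry is written before the result is returned, an
unauthorized attempt leaves the same record of who asked for what
as a legitimate one.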

Metadata should be afforded at least the same protection, if
not more, than the data in the database. Leong-Hong and Plagman
[Ref. 13] present an example of the importance of the security
of metadata as it concerns


the data resources in [...] applications such as the
classification code of intelligence documents. When security
profiles for the metadata entities are stored in the
metadatabase, unauthorized access to the metadata could be
most damaging. This is because presumably one would be able
to 'crack into' the system using that information.


It is the task of the dictionary administrator who manages the
metadata to determine the levels of security required and to
grant access privileges (read and write, read only, update) to
users for certain portions of the metadata. Information about
users, their passwords, and privileges is stored in the data
dictionary and is accessible only to personnel authorized by
the administrator.

We have already shown in Figure 2.3 that subschemas contribute
to security by limiting the size of the "window" through which
a database user looks at data. When a user attempts to access a
particular subschema, the request is routed through the data
dictionary to determine whether access is authorized and, if
so, the structure of the subschema. Only at this point is the
"real" data in the database accessed.
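
The order of events in this paragraph (check authorization, look
up the subschema structure, and only then touch the data) can be
sketched as below. The subschema name CREW-VIEW, its field list,
the grants table, and the sample row are all invented for
illustration.

```python
# Hypothetical metadata held by the dictionary.
subschemas = {"CREW-VIEW": ["SSN", "NAME", "RANK"]}
grants = {("smith", "CREW-VIEW"): "read only"}

# Stand-in for the "real" database; the row is fabricated sample data.
database = [
    {"SSN": "000-00-0000", "NAME": "DOE", "SERVICE": "USN", "RANK": "LT"},
]


def fetch(user, subschema):
    """Return rows limited to the subschema's fields, if the user is authorized."""
    if (user, subschema) not in grants:          # step 1: authorization
        raise PermissionError(f"{user} may not access {subschema}")
    fields = subschemas[subschema]               # step 2: subschema structure
    return [{f: row[f] for f in fields} for row in database]  # step 3: data
```

An unauthorized request fails at the first step, so the database
itself is never read on that user's behalf.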


2. Data Integrity


The keys to data integrity are the control of inputs to the
database and the minimization of data duplication. Properly
used, these keys will enhance communication between users by
ensuring that a single, correct source of data is maintained.

Because the data in a database is shared among many users, it
is essential to have some means of enforcing standards for
entering data, updating it, and maintaining it. For example,
the data dictionary identifies constraints, or limitations on
the values data can have. Fields can be defined as being
mandatory or optional, as alphanumeric or numeric, and as
having a minimum or maximum length. The data dictionary
contains comments on how data should be used in order to assist
those using the data dictionary. Another important control
feature of a data dictionary is how it deals with synonyms,
that is, an entity or attribute with more than one name. For
instance, the entities EMPLOYEE, REGIONAL MANAGER, and
EXECUTIVE may all be used by different departments in the
organization to refer to Linda Smith. The administrator must
standardize the terminology used in the organization and
eliminate as many synonyms as possible. When this is not
feasible, all of these synonyms, or aliases, must be recorded
in the data dictionary. Van Duyn [Ref. 14] explains that


It is not unusual to have similar types of data elements in
the database and in various applications... In such cases,
and in cases where the same data type is known by other
names, the DDS [data dictionary] can be used to inform the
users of the relationships that exist among these data and
of the disposition of their usage. In other words, the DDS
provides information as to which [...] and systems use the
same data type and how they relate.


The data dictionary also contributes to data integrity because
it reduces the necessity for duplication of data and therefore
lessens the opportunities for error. The information about the
components of different subschemas of the same logical view is
stored in the data dictionary in place of the data itself. A
user, whether writing a program or creating a new entity-type,
should be able to query the data dictionary to ensure that the
necessary routines or entities do not already exist within the
system.


Perhaps one of the most important benefits of DDS [data
dictionaries] is that because it gives accurate and timely
information, management can control more efficiently not
only the automated and manual data of the enterprise but
all its resources and operations. Consequently, management
is provided with precise and accurate data for quick,
profitable decision-making [Ref. 15].


Thus, the possibility of two users querying the database and
receiving different answers to the same question at the same
time is decreased.
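
A minimal sketch of alias handling follows. It assumes, purely
for the sake of the example, that EMPLOYEE is chosen as the
standard name and that the other two names from the Linda Smith
example are recorded as its aliases; the functions canonical and
already_defined are hypothetical.

```python
# Assumed alias table: alias -> standard name chosen by the administrator.
aliases = {"REGIONAL MANAGER": "EMPLOYEE", "EXECUTIVE": "EMPLOYEE"}

# Standard names that already have a definition in the dictionary.
defined = {"EMPLOYEE"}


def canonical(name):
    """Map any recorded alias to its standard name (names are case-insensitive)."""
    name = name.upper()
    return aliases.get(name, name)


def already_defined(name):
    """True if the entity exists under this name or any recorded alias."""
    return canonical(name) in defined
```

A user about to create EXECUTIVE as a new entity would be told
that it already exists as EMPLOYEE, which is how the duplication
described above is avoided.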


3. Documentation/Maintenance


Because maintenance is the most expensive and time-consuming
phase of software development, documentation and maintenance of
the organization's data is probably the most significant
objective of the data dictionary. It is a fact of software life
that documentation is often avoided during system development
and program design. To a large extent, this is because
documentation can be prepared as an "afterthought"; it is not
essential to the operation of the system. But when a system is
developed that includes a data dictionary from the beginning,
the data which is required by the data dictionary forces
documentation to become an integral part of the design. "The
use of a dictionary provides documentation of a quality and
form that is simply not available through less formalized
procedures in the data processing environment" [Ref. 16].

The data dictionary can also reduce the amount of effort
required by maintenance personnel because it provides "a
'roadmap' for the programmer doing maintenance. It records the
programs being maintained, their data structures and their
relationships" [Ref. 17]. We have defined an active data
dictionary as one in which information is created, accessed, or
modified through the data dictionary interfaces with new or
changed metadata automatically stored in the data dictionary.
This "continuous maintenance" can be used to allow the database
administrator to monitor where data is used, who uses it, how
often it is used, and what changes have been made to it.
Because the data dictionary provides a wealth of documentation,
it is possible to trace an "audit trail" through the
organization's data, from user names and department to the kind
of data used in a program to how many records a certain field
appears in. Also,


the tracking of how programs/modules use particular data as
well as which files/segments contain certain data is
extremely important to the systems analyst in performing
system changes. Through the DDS [data dictionary], he or she
is able to ascertain what impact the proposed changes will
have on other components of the system and upon functional
areas within the enterprise. By having an accurate
up-to-date assessment of the location and usage of data that
will be involved in the system change, the analyst can
accomplish the task more efficiently [Ref. 18].
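
The impact assessment described in this passage amounts to a
"where used" lookup. The sketch below assumes a simple table of
programs and the data items each one references; the program
names and the function programs_affected are hypothetical.

```python
# Hypothetical usage records: program name -> data items it references.
usage = {
    "PAYROLL-RUN": {"SSN", "RANK"},
    "MUSTER-REPORT": {"SSN", "NAME"},
    "PARTS-REORDER": {"Navy Part Number", "Quantity"},
}


def programs_affected(item_name):
    """Return, in alphabetical order, every program that uses the data item."""
    return sorted(
        program for program, items in usage.items() if item_name in items
    )
```

Before changing the format of SSN, for instance, the analyst
would learn from this lookup that two programs need to be
reviewed, while a change to Quantity touches only one.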


Once an organization has decided to make a commitment to manage
its data using a data dictionary, it must decide what kind of
data dictionary best suits its particular needs. In the next
section, we will look at the features of what we have called
the "ideal data dictionary" as a basis for evaluating the many
commercially available data dictionary packages from which the
organization must choose.


C. THE IDEAL DATA DICTIONARY 


Having identified the functions of a data dictionary in Chapter
II and how they support the accomplishment of the objectives
just discussed, it will be helpful to use these concepts to
evaluate data dictionaries. The "ideal" data dictionary would
be one that possesses all the capabilities necessary to support
all potential users in all possible applications. However, this
ideal dictionary would be impossible to conceptualize, much
less to create. The ideal data dictionary for an organization
will depend on the organization's size, functions, and needs.
The potential users of a dictionary will have to develop a set
of criteria upon which a candidate will be judged.

Many references provide criteria for evaluating data
dictionaries and identify those characteristics which are vital
to the management of the data resource. Unfortunately, it is
difficult to find two references that propose the same
criteria. One excellent source, Leong-Hong and Plagman
[Ref. 19], lists nine categories for evaluation:

1. data description facility

2. data documentation support

3. metadata generation

4. security support

5. integrity support

6. user interface

7. ease of use

8. resource utilization

9. vendor support

It is important to recognize a distinction between two
categories of criteria for the ideal data dictionary: those
that evaluate the vendor and operating environment, and those
that evaluate the data dictionary itself. In the former
category, items like vendor support and reliability, the choice
between free-standing or DBMS-dependent data dictionaries, the
degree of integration with other system components, and the
quality of system documentation are important considerations
that may drive the decision between two comparable data
dictionaries. It is, however, the latter type of criterion that
will be vital in identification of the essential requirements
of the ideal data dictionary. We have grouped all such
requirements into six categories: system standard schema and
extensibility, command and query languages, ease of use
(including menus), security, documentation and reports, and
application interfaces. (We have assumed that the objective of
data integrity will be accomplished by the correct, and
enforced, use of any data dictionary.) If a particular
dictionary supports each of these six criteria, then it will
most likely meet all of the organization's data management
needs.


1. System Standard Schema and Extensibility


The ideal data dictionary must provide a system standard schema
with all the descriptors necessary to support the range of
applications required by the organization while still being
simple enough to be competitively priced. It must provide
"enough" descriptors to be fully capable without providing so
many that the schema becomes confusing. Additionally, the ideal
dictionary must support the user (or data dictionary
administrator) in modifying existing schema descriptors and
creating new entities, relationships, and attributes. This
extensibility is vital in supporting applications specific to
the organization's needs.


2. Command and Query Language 




The ideal dictionary must provide both command and query
languages. The command language must support creation and
modification of data structures and subsequent entry of data
into those structures. The command language must include edit
commands to facilitate addition, modification, and deletion of
system data. It should include commands restricted to use by
the data dictionary administrator, e.g., password assignment.
The ideal system will include a query language to support the
analysis and production of usable information from the
organization's data. Perhaps one of the most important features
of a data dictionary (and database), query languages allow data
to be screened in order to provide concise and specific
information to support timely management decisions.


3. Ease of Use


Ease of use, or user-friendliness, is another important aspect
of the ideal data dictionary. It must be supportive of new
users while still providing full functional support of the
system "experts". Two primary ingredients of user-friendliness
are the availability of menus and carefully conceived examples
in the dictionary's reference manuals. A hierarchy of menus can
reduce complex operations to a series of smaller, friendlier
steps while user documentation provides easy-to-understand
examples that guide the inexperienced user through each phase
of system operation. As microcomputers and the concept of the
automated office continue to spread, ease of use will become an
even more important consideration in deciding which software
products to utilize.


4. Security 


Security will be a vital concern of the ideal data dictionary.
Protection and control of system information must be provided.
The data dictionary administrator must be provided the
capability to control personnel access to system data. He or
she must also be able to grant different degrees of access to
different users. Similarly, users should have the capabilities
to protect, and grant access to, those structures and data
which they control.


5. Documentation and Reports


The documentation and reports created by the ideal data
dictionary must also be clear and understandable. Timely and
accurate preparation of reports is a key objective of any DBMS.
The data dictionary is uniquely qualified to assist with this
function. By ensuring the integrity of data accessed and
supporting query commands, the ideal data dictionary can
provide reports and documentation to answer specific questions
as they arise.


6. Application Interfaces


The final important characteristic of the ideal data dictionary
is its ability to interface with the other applications that
may exist in the organization. If the data dictionary is
free-standing, it should interface with many of the currently
available database management systems. If DBMS-dependent, the
dictionary should interface with all components of that system.
Additionally, the ideal data dictionary should interface with
code generators, communication systems, and other agents of the
user's environment.

In the following chapter, we will study and evaluate four of
the popular data dictionaries that are currently available. We
will use these characteristics of the ideal data dictionary
that we have defined to compare and contrast the features of
the four dictionaries. In addition, each will be compared to
the "standard" dictionary presented in the FIPS DDS.


V. EVALUATION OF COMMERCIAL DATA DICTIONARIES

The purpose of this chapter is to review and evaluate a
cross-section of commercial data dictionary packages. We
selected four dictionaries: DATA DESIGNER, DATAMANAGER, ORACLE,
and DATADICTIONARY. User documentation and library sources were
the primary sources of information for our evaluation.
Additionally, ORACLE was available on the Naval Postgraduate
School's VAX minicomputer, and we observed demonstrations of
DATA DESIGNER and DATADICTIONARY.


A. DATA DESIGNER


DATA DESIGNER is a free-standing data dictionary developed by
Database Design, Inc. It was introduced in 1975 with the goal
of supporting logical database design by solving some of the
traditional problems associated with multiple-application
database management systems, such as duplication of data,
excessive storage requirements, data consistency, complexity,
and modifiability. DATA DESIGNER can be used in conjunction
with a variety of database management systems, including IMS,
IDMS, ADABAS, NOMAD, and others. Additionally, it can produce
designs that will interface with COBOL and other non-DBMS tools
or systems.

DATA DESIGNER can be characterized as an


automated, easy-to-use tool that assists the database
designer in formulating normalized views of the data
requirements and synthesizes these views into a canonical
normalized form . . . DATA DESIGNER maintains information
needed to physically structure the database for efficient
performance [Ref. 20].


In addition to providing the standard functions of a data
dictionary, DATA DESIGNER goes several steps beyond. It
provides an extensive set of commands categorized as user
commands, edit commands, and plotting commands, as shown in
Table 1. It also supports limited production of models and
graphics. Furthermore, DATA DESIGNER's capabilities include
powerful generation options and report features that will
support the design and maintenance of applications.


                        TABLE 1
          Standard Commands of DATA DESIGNER

   User Commands
      ADD              BATCH            BUILD
      COPY             CREATE           EMPTY
      END              [illegible]      GENERATE
      HELP             HIERARCHY        PLOT
      PRINT            RENAME           REPORT
      SHOW OPTIONS     TRANSFER         VALIDATE

   Edit Commands
      DELETE           EDIT             INSERT
      LIST             RENUMBER         REPLACE

   Plotting Commands
      DRAW             DONE             RETURN
      SET ALT          SET DEVICE       SET RANGE
      SET TITLE        SET TYPE         SHOW


DATA DESIGNER supports logical database design through a
five-step process:

1. A data dictionary file is created that contains a
   list of all standard data item names to be used.

2. Subschema files are created that describe all of the
   views necessary to support user data requirements.

3. The encoded user views are validated. This step
   verifies the syntax of each view and ensures that
   each data item name listed appears in the data
   dictionary.

4. All of the verified user views are synthesized into
   a logical data model. Reports and diagrams are
   generated to reflect this model.

5. The model is evaluated to ensure that it meets all
   user requirements and is modified as necessary by
   repeating steps (1) through (4).
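
The name check in step 3 can be illustrated with a short sketch.
This is not DATA DESIGNER's own algorithm, which the
documentation we reviewed does not describe at this level; it is
a Python illustration of comparing view contents to a list of
standard names. The view STU-X and the item BRANCH are invented
to show a failing case.

```python
def validate(dictionary_items, views):
    """Return {view name: [item names not found in the dictionary]}.

    Views whose items are all known do not appear in the result.
    """
    known = set(dictionary_items)
    errors = {}
    for view, items in views.items():
        missing = [item for item in items if item not in known]
        if missing:
            errors[view] = missing
    return errors


# Standard data item names, as in the STUDENT example.
dictionary_items = ["NAME", "SSN", "SERVICE", "RANK"]

# STU-X and BRANCH are hypothetical, added only to show a failure.
views = {
    "STU-1": ["SSN", "NAME", "SERVICE", "RANK"],
    "STU-X": ["SSN", "BRANCH"],
}
```

Calling validate(dictionary_items, views) reports BRANCH under
STU-X and nothing for STU-1, so only the faulty view would need
to be edited before the views are synthesized in step 4.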

DATA DESIGNER utilizes three kinds of files: dictionary files,
subschema files, and generated design files. A dictionary file
($DIC) contains a list of all data elements that will be used
in an application or subschema. This list serves as a base for
further development, e.g., additional views. A subschema file
($SUB) contains data items and relationships pertaining to
particular views. Finally, the generated design file ($DES)
contains a logical data model generated by DATA DESIGNER using
the applicable dictionary and subschema files as input. The
generated design files, in turn, serve as the input for the
report and graphics functions.


Key commands utilized during the creation of a logical database
design include the following:


CREATE--defines dictionary and subschema files.

BUILD--enters data item names into created files.

VALIDATE--compares the subschema files to the
dictionary file.

GENERATE--creates a logical DB design from the
validated files.

REPORT--produces documentation for the
logical design.

PLOT--uses the plotting subsystem to draw the
logical design.

EDIT--supports modification of existing files when
necessary.


To better acquaint the reader with the operation of DATA
DESIGNER, we will demonstrate the dialog associated with each
step of the process necessary to create our Naval Postgraduate
School database example of Chapter II. The user of DATA
DESIGNER must first create the dictionary file STUDENT.DIC and
the subschema file STUDENT.SUB (user inputs are indicated by
boldface type) as follows:


>CREATE STUDENT.DIC DICTIONARY
DDFC0203I File "STUDENT.DIC" of type "$DIC" created.
>CREATE STUDENT.SUB SUBSCHEMA
DDFC0203I File "STUDENT.SUB" of type "$SUB" created.


Next, the BUILD command is used to load data items into the
two created files. First, all possible data items are listed
in the dictionary file:


>BUILD STUDENT.DIC
DDBS0065I The file type is $DIC.
DDBS0018I There are no records in the file.
B>NAME
B>SSN
B>SERVICE
B>RANK
B>DONE
DDBS0064I File building is done.
DDBS0068I 4 records were entered
DDRN0098I Line 1100 is now the last line in your file.


The subschema file will support creation of one or more user
views. In our example, the subschema file contains two
views, the basic, overall view and the view intended for
Army use only. Notice that after the user enters the BUILD
process, each line must start with a modeling code. These
codes are used to identify components and to establish
relationships within the views. When building the subschema
files, all desired relationships must be specifically
stated. DATA DESIGNER uses "1" to specify a one-to-one
relationship and "M" for a one-to-many relationship. A
complete list of the modeling codes used in this example
appears in Table 2.


>BUILD STUDENT.SUB
DDBS0075I The file type is $SUB.
DDBS0081I There are no records in the file.
B>V,STU-1
***************************************
* THIS VIEW SUPPORTS THE OVERALL VIEW *
***************************************
B>F,0100
B>T,0003
B>K,SSN
B>1,NAME
B>1,SERVICE
B>1,RANK
B>V,STU-ARMY
***************************************
* THIS VIEW SUPPORTS THE ARMY VERSION *
***************************************
B>F,0125
B>T,0002
B>K,SSN
B>1,NAME
B>1,RANK
***************************************
B>DONE
DDBS0064I File building is done.
DDBS0068I 13 records were entered


TABLE 2
DATA DESIGNER Modeling Codes

Code   Modeling Use
V      Name a user view
F      Specify frequency of use
T      Specify required response times
K      Name a key
C      Concatenate keys and data
S      Concatenate keys in short way
L      Label a data item
M      Identify a multiple association
1      Identify a single association
N      Name an association
*      Insert comments
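
Because every subschema entry has the form "code,value", the
entries can be grouped by view with a simple line-oriented
parse. The following Python sketch is our own illustration
and is not part of DATA DESIGNER; the function name
parse_subschema and the layout of the returned dictionary are
assumptions, and only the codes that appear in the STUDENT
example are handled.

```python
def parse_subschema(lines):
    """Group 'code,value' entries by user view (hypothetical layout)."""
    views = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("*"):    # '*' marks a comment line
            continue
        code, _, value = line.partition(",")
        code, value = code.strip(), value.strip()
        if code == "V":                         # V names a new user view
            current = {"frequency": None, "response": None,
                       "key": None, "single": [], "multiple": []}
            views[value] = current
        elif current is None:
            raise ValueError("entry before any V line: " + line)
        elif code == "F":
            current["frequency"] = int(value)
        elif code == "T":
            current["response"] = int(value)
        elif code == "K":
            current["key"] = value
        elif code == "1":
            current["single"].append(value)
        elif code == "M":
            current["multiple"].append(value)
        else:
            raise ValueError("code not handled in this sketch: " + code)
    return views


# The entries typed during the BUILD STUDENT.SUB dialog.
STUDENT_SUB = [
    "V,STU-1",
    "* THIS VIEW SUPPORTS THE OVERALL VIEW *",
    "F,0100", "T,0003", "K,SSN", "1,NAME", "1,SERVICE", "1,RANK",
    "V,STU-ARMY",
    "* THIS VIEW SUPPORTS THE ARMY VERSION *",
    "F,0125", "T,0002", "K,SSN", "1,NAME", "1,RANK",
]

views = parse_subschema(STUDENT_SUB)
print(sorted(views))
```

Comment lines are skipped, which is consistent with the
dialog reporting 13 records for the 13 coded entries.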


Once the dictionary and subschema files are formatted,
the VALIDATE command is used to ensure that all entries and
relationships in the subschema files are valid based on the
information previously specified in the dictionary file.
DATA DESIGNER will respond with the number of views
processed, the number of lines read, and the number of
validation errors, if any, that were located:


>VALIDATE STUDENT.SUB STUDENT.DIC
DDVS0013I Validation begins.
DDVS0024I 2 Views were processed.
DDVS0025I 13 lines were read.
DDVS0015I 0 validation errors were detected.
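
The name check at the core of this step can be sketched in a
few lines of Python. This is only an illustration under our
own assumptions: the function validate and its data layout
are hypothetical, and the syntax checks that VALIDATE also
performs are not modeled.

```python
def validate(views, dictionary):
    """Return (view, item) pairs whose item is absent from the dictionary."""
    known = set(dictionary)
    errors = []
    for view_name, items in views.items():
        for item in items:
            if item not in known:
                errors.append((view_name, item))
    return errors


# Contents of STUDENT.DIC and the item names used in each view.
dictionary = ["NAME", "SSN", "SERVICE", "RANK"]
views = {
    "STU-1": ["SSN", "NAME", "SERVICE", "RANK"],
    "STU-ARMY": ["SSN", "NAME", "RANK"],
}

errors = validate(views, dictionary)
print(len(views), "views processed,", len(errors), "validation errors")
```

A view that named an item missing from the dictionary, for
example a hypothetical GPA item, would be returned in the
error list rather than silently accepted.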


Once the files are successfully validated, the user will 
utilize the subschemas to generate a logical database design
for his or her application. 

The ten GENERATE options from which the user can choose
are powerful features that allow the user to control the way
that DATA DESIGNER produces a design and supports requests
for varying degrees of information during the generation
process. If the GENERATE command is called without options,
DATA DESIGNER will create a design that removes all
redundant data elements, generates intersection data groups
as necessary to resolve many-to-many relationships,
suppresses repeating data elements within data groups,
generates single key data groups from concatenated keys, and
considers all frequency and timing information that was
contained in the subschema files. In all cases, the end
product of the GENERATE command will be creation of a $DES
file, in this case, STUDENT.DES. The factory user's guide
recommends that options 4, 5, 6, 9, and 10 be used when
generating the initial design or after major revisions to
the input files.
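
To make the idea of an intersection data group concrete, the
following Python sketch replaces each pair of opposing "M"
associations with a new group that both sides point to. It
is our own simplified illustration, not DATA DESIGNER's
actual algorithm; the function name, the "+" naming of the
intersection group, and the STUDENT/CLASS links are all
assumptions.

```python
def resolve_many_to_many(links):
    """links: set of (source, target, kind); kind is '1' or 'M'."""
    resolved = set(links)
    for source, target, kind in links:
        # A many-to-many pair is an 'M' link in both directions.
        # The source < target test handles each pair only once.
        if kind == "M" and (target, source, "M") in links and source < target:
            resolved.discard((source, target, "M"))
            resolved.discard((target, source, "M"))
            intersection = source + "+" + target
            resolved.add((source, intersection, "M"))
            resolved.add((target, intersection, "M"))
    return resolved


links = {
    ("STUDENT", "CLASS", "M"),   # a student takes many classes
    ("CLASS", "STUDENT", "M"),   # a class holds many students
    ("SSN", "NAME", "1"),        # untouched single association
}

design = resolve_many_to_many(links)
print(sorted(design))
```

After the call, STUDENT and CLASS each have a one-to-many
link to the new CLASS+STUDENT group, and no direct
many-to-many link remains.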


A brief description of each generate option is shown in
Table 3. Continuing with our student database example, the
user's dialog will be

>GENERATE OPTION 4 5 6 9 10 TO STUDENT.DESIGN
DDGS0032I Design generation begins.
DDGS0058I The Subschema file is STUDENT.SUB
DDGS0214I Option 4 ignores undefined links.
DDGS0281I Option 5 generates foreign key information.
DDGS0301I Option 6 generates candidate key information.
DDGS0507I Option 9 generates cross-reference info.
DDGS0307I Option 10 ignores frequency and timing info.
DDGS0054I Design generation has finished.


TABLE 3
DATA DESIGNER Generate Options

Option  Purpose
1       Generate unspecified associations.
2       Suppress resolving redundant data.
3       Suppress creating intersection files.
4       Suppress generating inverse links.
5       Generate foreign key information.
6       Generate candidate key information.
7       Allow repeating data items in groups.
8       Suppress generating single key groups.
9       Generate cross-reference information.
10      Suppress frequency/timing information.


At this point, the logical database design is completed.
When using the options specified in the example, a series of
reports will be automatically generated. A list of reports
created is contained in Table 4.

TABLE 4
Reports Available with DATA DESIGNER

Report  Type
1       Data Group Links Report
2       Canonical Schema Report
3       Data Item Index Report
4       Multiple Occurrences of Data Items
5       Data Relation Report
6       Data Group Candidate Keys Report
7       Data Item to User View Cross-Reference
8       User View to Data Group Cross-Reference
9       Data Group to User View Cross-Reference

To print these reports, the user's dialog will simply be


>REPORT 1 2 3 4 5 6 7 8 9 PRINTER FROM STUDENT.DESIGN
DDP20073I The reports were printed.


As a final aid in evaluation of the logical database
design, DATA DESIGNER is capable of producing diagrams of
(1) an overview of the logical database design and/or (2) a
hierarchical representation of that logical design. To
produce the logical overview diagram, the following dialog
is required:


>PLOT
DDPT0289I DATA DESIGNER Print Plot Release 2.5A
P>SET TYPE OVERVIEW
P>SET TITLE LOGICAL-DESIGN
P>DRAW FROM STUDENT.DESIGN
DDFS0310I Design STUDENT.DESIGN's description loaded.
DDNX0271I The overview plot generation is done.
P>RETURN
P>END


After using the printed reports and diagrams to evaluate the
database design, the user will, if satisfied, transcribe the
design into a specific DBMS format, such as ADABAS, or use
DATA DESIGNER's EDIT capabilities to revise the design as
necessary.

As discussed in Chapter IV, data dictionaries can be 
evaluated on the basis of their accomplishment of security, 
integrity, and documentation/maintenance. DATA DESIGNER, as 
a free-standing data dictionary that can be used in conjunc-
tion with a variety of DBMS and non-DBMS systems, does not
address the security aspect. It was apparently designed 
with the assumption that the parent system with which DATA 
DESIGNER interacts will handle access control and other 
security-related functions. 

DATA DESIGNER does, however, receive high marks for
maintaining data integrity and for the quality of its
documentation. Because it is designed to support the
development of logical database designs, it utilizes its
dictionary files to ensure that duplication of data is
prevented through generation of cross-reference files. When
a subschema is modified, DATA DESIGNER again utilizes its
dictionary files in the subsequent design generations. The
PLOT and REPORT functions provide a wealth of information
about the design, its components, and all users of the
subschemas. Relationships, both those included by the user
and those produced by DATA DESIGNER, can be seen in written
reports and visual representations. When modifications and
new designs are produced, the reports are automatically
updated to reflect all changes.


B. MSP DATAMANAGER 


DATAMANAGER, developed by MSP, Inc. of Lexington,
Massachusetts, is one member of the MANAGER family of
dictionary-oriented software products. Other products
include DESIGNMANAGER, PROJECTMANAGER, SOURCEMANAGER, and
TESTMANAGER. The entire line of products, while capable of
batch operations, is designed specifically to support
interactive operations with IBM 360/370/30xx/4300 series
(and plug compatible) computers. While DATAMANAGER is
designed as a nucleus for further expansion or
specialization, it provides all basic capabilities necessary
to create and maintain user dictionaries. Additional
capabilities, available as a series of extra-cost, add-on
modules, include:

1. interfaces to IDMS, ADABAS, IMS, TOTAL, SYSTEM 2000,
and other DBMS

2. teleprocessing interfaces

3. generation of COBOL, PL/I, or other source language
data descriptions

4. generation of DATAMANAGER data definitions from
existing COBOL or PL/I source code

5. interfacing of a DATAMANAGER dictionary to user-
written programs

6. status, audit, and security facilities

7. extensibility through a user-defined syntax facility


DATAMANAGER can provide data dictionary capabilities to
users utilizing a variety of hardware/software combinations.
By providing interface modules for several popular database
management systems, DATAMANAGER is obviously more flexible
than one that is tied to a single, distinct database system.
However, DATAMANAGER's flexibility extends beyond the
obvious:


DATAMANAGER is intended for use in any organization in
which there is a computerized data processing function.
Its use, however, is not confined to those elements of
data that are held in computer files or that are acted
upon by computerized systems. Definitions of all data
held and used by an organization in its manual systems
as well as its computerized ones can be held in a
DATAMANAGER data dictionary. DATAMANAGER is designed to
be used both with traditional files, powerful database
systems, and in a mixed environment. Use of the data
dictionary remains independent of the database manage-
ment system, although further add-on facilities enable
DATAMANAGER data definitions to be generated directly
from the database data description language source
coding. [Ref.]


The architecture, or structure, of the DATAMANAGER data 
dictionary is composed of four (or five) data files, called 
data sets in the user documentation. 


The source data set contains the data definitions as
originally input into the system by the user. When the user
modifies or appends changes, the data definitions are
automatically updated within the file.

The data entries data set contains all encoded data
definitions generated by DATAMANAGER after evaluating the
contents of the source data set. Data definitions are
encoded to reduce the time required for DATAMANAGER to
process the information within the data dictionary. During
this encoding process, relationships, aliases, and
classifications are also identified.

The index data set is an automated index containing the
name and address of each entity definition that is in the
source data or data entries data sets. The index data set
serves as a data directory to support the fastest possible
retrieval of entity definitions and associated data.

The error recovery data set is used by the system as a
temporary backup storage file. This capability was
implemented to increase reliability by providing for
automatic recovery of the dictionary contents in the case of
external interruption or other system failure during a
dictionary update.

The log data set is an optional capability that is
highly recommended by MSP. All updating commands, associated
data definitions, and amendments are logged into that
file as they occur. Entries include command identification,
full date, time, user, and status of all physical input/
output accesses. Additionally, the data administrator has
the option of specifying that all commands directed to the
data dictionary be logged. When combined with other system
backup facilities, this allows DATAMANAGER to be "rolled"
forward from the last backup point in case full recovery is
ever required.
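
The roll-forward idea can be illustrated with a short Python
sketch: start from the last backup copy and reapply, in time
order, every successfully logged update made after the
backup was taken. The sketch is ours and makes several
assumptions: the record fields, the integer time stamps, the
"OK" status value, and the meaning given to ADD, REPLACE, and
REMOVE are hypothetical and are not taken from the
DATAMANAGER log format.

```python
def roll_forward(backup, log, backup_time):
    """Rebuild dictionary contents from a backup copy plus logged updates."""
    dictionary = dict(backup)                 # never mutate the backup itself
    for entry in sorted(log, key=lambda e: e["time"]):
        if entry["time"] <= backup_time or entry["status"] != "OK":
            continue                          # already in backup, or failed
        if entry["command"] in ("ADD", "REPLACE"):
            dictionary[entry["member"]] = entry["definition"]
        elif entry["command"] == "REMOVE":
            dictionary.pop(entry["member"], None)
    return dictionary


backup = {"SSN": "ITEM, 9 digits"}
log = [
    {"time": 3, "user": "DBA", "command": "ADD", "member": "RANK",
     "definition": "ITEM, 4 characters", "status": "OK"},
    {"time": 1, "user": "DBA", "command": "ADD", "member": "SSN",
     "definition": "ITEM, 9 digits", "status": "OK"},
    {"time": 4, "user": "JONES", "command": "REMOVE", "member": "SSN",
     "definition": None, "status": "REJECTED"},
]

restored = roll_forward(backup, log, backup_time=2)
print(restored)
```

Only the update logged at time 3 is reapplied; the entry at
time 1 predates the backup and the rejected REMOVE never took
effect.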

DATAMANAGER is a powerful system that utilizes a series
of interactive commands to create, maintain, and document
data dictionary contents. These standard commands are
listed in Table 5. DATAMANAGER provides a predefined series
of standard entity-types, relationship-types, and attribute-
types that form the system standard schema. These are
listed in Table 6. As shown in Table 6, DATAMANAGER uses
only six entity-types in the standard schema. Those
elements exist within the system as members of a logical
hierarchy as shown in Figure 5.1. Discussion in the user
documentation reveals that DATAMANAGER strives to provide
the capability to maintain all system data while maintaining
ease and simplicity of logical design.
TABLE 5
DATAMANAGER Standard Commands

ADD          ALSO         ALTER
AUTHORITY    BULK         COPY
DICTIONARY   DOES         DROP
ENCODE       ENDDMR       FORMAT
GLOSSARY     INSERT       KEEP
LIST         MODIFY       PERFORM
PRINT        PROTECT      REMOVE
RENAME       REPLACE      REPORT
SHOW         STATUS       WHAT
WHICH        WHO          WHOSE


TABLE 6
DATAMANAGER Standard Schema Descriptors

PROCESS ENTITY-TYPES
MODULE            PROGRAM               SYSTEM

DATA ENTITY-TYPES
FILE              GROUP                 ITEM

RELATIONSHIP-TYPES
SEE

ATTRIBUTE-TYPES
ACCESS-AUTHORITY  ADMINISTRATIVE-DATA
ALIAS             CATALOGUE
COMMENT           DESCRIPTION
EFFECTIVE-DATE    [illegible]
NOTE              OBSOLETE-DATE
QUERY             SECURITY-CLASS
[illegible]
A complete specification of the data resource of an
organization requires the definition of the characteris-
tics and of the [illegible] of data, and of the
contexts in which the data is used. Accordingly, the
design of DATAMANAGER provides for a hierarchy of member
types within which it is possible to describe all
elements and assemblages of data and the processes that
act on the data. The number of member types defined for
the basic hierarchy has been kept as small as possible
while meeting these requirements. [Ref. 22]


Figure 5.1 DATAMANAGER's Hierarchy of Entity-types


At the lowest level, an ITEM is a fundamental element of
data, the smallest unit within DATAMANAGER. A GROUP is a
collection of items or other groups. The third entity-type,
the FILE, can either be implemented as a traditional file
organization (a collection of data groups, independent of a
DBMS), or as the equivalent association of data within a
database. If DATAMANAGER is used with a database, another
entity-type, DATABASE, will be provided with the database
interface module, e.g., ADABAS, that is selected. The new
member, in this case, ADABAS-DATABASE, will either replace
the FILE element within the hierarchy, or coexist by
residing between the FILE and MODULE elements. A MODULE is
a collection of data that includes descriptions of a data-
base (if used), FILEs, GROUPs, and/or ITEMs. The module is
the lowest unit that can directly or indirectly manipulate
data, and is a subdivision of a PROGRAM. The PROGRAM is
defined in terms of collections of modules and those
processes that input or output data to/from the system. A
program is executable. A SYSTEM is the highest element of
the DATAMANAGER hierarchy and contains all subordinate data
declarations.
While DATAMANAGER stresses simplicity in the logical
design of the system standard schema, it can be configured
to be highly extensible. An add-on module, the User Defined
Syntax Facility (UDSF), is required to support user
declaration of schema descriptors. If present, this facility
provides several unique capabilities. First, in addition to
allowing the user to define his or her own entity-types, the
module allows the data administrator to insert one (or more)
of three standard sets of extended entity-types. These sets
are:

1. The Extended Data Processing Structure (EDPS) which
provides additional entity-types frequently used
within the data processing environment. These
include PROCEDURE, SUBROUTINE, and DATASET.

2. The Structured Analysis Structure (SAS) which
provides entity-types frequently used when
conducting structured design. These include
SUBPROCESS and DATA-STRUCTURE.

3. The Structured Development Structure (SDS) which
strives to provide all entity-types necessary to
satisfy the requirements of the majority of poten-
tial users. This collection of entity-types includes
all those found in the EDPS and SAS subsets.
Second, the UDSF module supports user definition of
attribute-types related to both system standard and user-
created entity-types. Three distinct categories of
attribute-types are recognized within DATAMANAGER. These
are:

1. Global (common) attribute-types which will apply to
all entity-types within the structure, e.g.,
[illegible].

2. Generic attribute-types which can be added to those
of a specific standard entity-type, for example,
FILE. Whenever a user-defined entity-type is
created that uses the standard entity-type's format
as a base, the generic attribute-types of the stan-
dard entity-type will be passed into the new entity-
type.

3. Specific attribute-types which allow the designer to
tailor an entity-type to satisfy the particular
requirements of that organization.

Finally, the UDSF module supports user definition of
relationship-types in both forward and backward directions.
This enables DATAMANAGER to support the three (or four)
relationship mappings we have previously described.
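
The way the three categories of attribute-types combine can
be sketched as follows. This Python fragment is purely
illustrative: the EntityType class, the choice of DESCRIPTION
and NOTE as global attribute-types, and the BLOCK-SIZE,
REEL-NUMBER, and TAPE-FILE names are our own assumptions and
are not UDSF syntax.

```python
# Assumed global attribute-types; they apply to every entity-type.
GLOBAL_ATTRIBUTES = ["DESCRIPTION", "NOTE"]


class EntityType:
    def __init__(self, name, base=None, generic=(), specific=()):
        self.name = name
        self.base = base                  # entity-type used as a format base
        self.generic = list(generic)      # passed on to derived entity-types
        self.specific = list(specific)    # local to this entity-type only

    def attribute_types(self):
        inherited = []
        parent = self.base
        while parent is not None:         # collect generic types up the chain
            inherited = parent.generic + inherited
            parent = parent.base
        return GLOBAL_ATTRIBUTES + inherited + self.generic + self.specific


file_type = EntityType("FILE", generic=["BLOCK-SIZE"])
tape_file = EntityType("TAPE-FILE", base=file_type,
                       specific=["REEL-NUMBER"])

print(tape_file.attribute_types())
```

The derived TAPE-FILE type receives the generic BLOCK-SIZE
attribute-type from FILE, while its specific REEL-NUMBER
attribute-type is not visible from FILE.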

Once DATAMANAGER is installed on the computer, two major
steps must be conducted before information can be entered in
the data dictionary. First, an empty data dictionary must
be defined using Controller commands (restricted to use by
the data administrator), DICTIONARY, and AUTHORITY.
Briefly, the dictionary must be created and opened,
authority levels must be defined, and potential users must
be identified. As the second major implementation step,
member entity-types, both standard and user-created, must
be defined. Every session with DATAMANAGER is conducted as
a "run", in which a series of system commands, specified by
the user, are carried out. Every session must initiate with
the commands DICTIONARY and AUTHORITY. After review of the
user documentation, this process will probably seem diffi-
cult and confusing to most users, even to those who have
worked with other data dictionaries. DATAMANAGER is,
however, an impressive, powerful package in the hands of an
experienced user. Our sample database, STUDENT, would be
entered as a FILE (or DATABASE, if implemented). The format
for an individual student's record becomes a GROUP, in which
each data element, e.g., service, SSN, etc., becomes an
ITEM. The structure of our example, after implementation in
DATAMANAGER, would appear as shown in Figure 5.2.
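
The FILE, GROUP, and ITEM nesting of the STUDENT example can
be mirrored with a few Python data classes. This is a sketch
of the structure only, not DATAMANAGER definition syntax;
the class names, the helper item_names, and the group name
STUDENT-RECORD are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Item:
    name: str                       # smallest unit of data


@dataclass
class Group:
    name: str                       # collection of items or other groups
    members: List[Union["Group", Item]] = field(default_factory=list)


@dataclass
class File:
    name: str                       # collection of data groups
    groups: List[Group] = field(default_factory=list)


def item_names(group):
    """List every ITEM reachable from a GROUP, descending into subgroups."""
    names = []
    for member in group.members:
        if isinstance(member, Group):
            names.extend(item_names(member))
        else:
            names.append(member.name)
    return names


student_record = Group("STUDENT-RECORD", [
    Item("NAME"), Item("SSN"), Item("SERVICE"), Item("RANK"),
])
student = File("STUDENT", [student_record])

print(item_names(student_record))
```

Because a GROUP may contain other groups, the helper recurses
so that nested record formats are flattened to their items.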
DATAMANAGER aggressively supports each of the three
objectives of data dictionary usage: data integrity,
security, and maintenance/documentation. It enforces data
integrity through its hierarchical structure of entity-
types, predefined standard schema relationships, identifica-
tion of aliases, and automatic update procedures. System
definitions and error-checking are used to validate the
structural "correctness" of each entity, relationship, and
attribute as it is created or defined. Once the FILE or
DATABASE is defined, DATAMANAGER monitors input of data into
system structures by comparing the input to the appropriate
ITEM's characteristics. Each of the MSP products, including
the DBMS interfaces, displays evidence that MSP recognizes
the importance of data integrity as a vital link to effi-
cient and dependable control of data.


Figure 5.2 STUDENT example in DATAMANAGER Structure


The DATAMANAGER nucleus provides security by inclusion
of one type of security mechanism, password control. The
Controller, or dictionary administrator, must assign a
unique password to each authorized user. Each user and
password combination must be registered within the
dictionary. DATAMANAGER will reject any command session
which does not commence with an AUTHORITY command followed
by an authorized password.

Several additional security mechanisms can be provided
by including the Audit and Security Facility module in the
system implementation. First, the Controller gains the
capability of registering general and specific security
levels within the dictionary. Each user may be assigned a
general security level in addition to the unique password
previously assigned. Within the system, the Controller will
assign a specific Insertion Security Level and a specific
Protection Security Level. A user whose general level is
lower than the specific insertion level is not allowed to
insert, modify, or delete information within the data
dictionary. This provides the capability to assign "read
only" access. A user whose general level is lower than the
specific protection level, or one who does not have a
general security level assigned, is not allowed to establish
protection for system members, or data structures.

Second, users who do have a general security level equal
to or higher than the specific protection level may use the
PROTECT command to assign protection to specific members in
the form of ACCESS, ALTER, and REMOVE security levels. This
capability allows key users to control, or even prohibit,
access to those structures that they own. Any member which
is not owned but does require security can be assigned the
same three control levels by the dictionary administrator.

Finally, the Audit module provides the capability to
produce over 500 different audit reports, using information
contained within DATAMANAGER. The majority of these reports
are reserved for use of the dictionary administrator alone.
This includes the capability of logging all commands issued
to the system. This "trace" mechanism increases security by
providing a record of all entries, or attempted entries, to
the system.
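
The comparison of a user's general security level against
the two specific levels can be expressed compactly. The
Python sketch below is our own reading of the rules described
above, with numeric levels in which a higher number means
more privilege; the function names are hypothetical, and the
treatment of a user with no general level in may_update is an
assumption, since the documentation states that case only for
protection.

```python
def may_update(general_level, insertion_level):
    """Insert, modify, or delete requires a level at or above the
    insertion level. A user with no level (None) is treated as
    read-only; that choice is an assumption of this sketch."""
    return general_level is not None and general_level >= insertion_level


def may_protect(general_level, protection_level):
    """PROTECT requires an assigned level at or above the protection level."""
    return general_level is not None and general_level >= protection_level


INSERTION_LEVEL = 5       # hypothetical values set by the Controller
PROTECTION_LEVEL = 8

for user, level in [("READER", 3), ("ANALYST", 5), ("OWNER", 9)]:
    print(user, may_update(level, INSERTION_LEVEL),
          may_protect(level, PROTECTION_LEVEL))
```

With these hypothetical values, READER has read-only access,
ANALYST may update but not protect, and OWNER may do both.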

The last significant objective of a data dictionary must
be to support maintenance and documentation of the informa-
tion contained within the information system. DATAMANAGER
provides a set of commands unique to the maintenance func-
tion. A listing of these is shown as Table 7. Maintenance
can be supported during both interactive and batch sessions.
A series of query and report commands are provided with the
nucleus module to support usage studies, maintenance, and
documentation. These commands are listed in Table 8. The
REPORT, PRINT, and GLOSSARY commands provide a great deal of
information to the dictionary administrator and other
designated users. When system data is modified, the query
and report commands can be used to provide updated documen-
tation and records.


TABLE 7
DATAMANAGER Maintenance Commands

INSERT       MODIFY        ENCODE
REPLACE      BULK ENCODE   COPY
ADD          RENAME        ALTER
REMOVE       KEEP          DROP
ALSO         KEEP          PERFORM


TABLE 8
DATAMANAGER Report/Query Commands

Report Commands
PRINT        BULK REPORT   LIST
SWITCH       REPORT        SKIP
GLOSSARY     SPACE         BULK PRINT
TEXT

Query Commands
WHAT         WHO           WHICH
WHOSE        DOES          SHOW

One additional DATAMANAGER capability warrants mention
with respect to maintenance and documentation. One system
entity-type which has not been discussed and does not reside
in the hierarchy shown earlier is the COMMAND-STREAM entity-
type. This structure is a unique feature of DATAMANAGER
that allows previously stored series of commands to be
executed by using the PERFORM command. The use of specific
COMMAND-STREAMs can be compared to the subroutines of a
general programming language. While the COMMAND-STREAM can
be used in many ways within DATAMANAGER, it becomes espe-
cially useful during generation of reports and documentation
during maintenance sessions. A "subroutine" can be speci-
fied that will produce all standard reports; when system
information is updated, the applicable reports are produced
by one simple PERFORM command at the end of the maintenance
session.
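
The subroutine analogy can be made concrete with a small
Python sketch in which a named list of commands is stored
once and later replayed by a single PERFORM. The Dictionary
class and the command strings are placeholders of our own and
do not reproduce actual DATAMANAGER command syntax; the
stream name STANDARD-REPORTS is likewise hypothetical.

```python
class Dictionary:
    def __init__(self):
        self.streams = {}      # stream name -> stored list of command strings
        self.executed = []     # commands carried out so far, in order

    def store_stream(self, name, commands):
        self.streams[name] = list(commands)

    def run(self, command):
        verb, _, argument = command.partition(" ")
        if verb == "PERFORM":
            # Replay the stored stream; in this sketch a stream may
            # itself contain further PERFORM commands.
            for stored in self.streams[argument]:
                self.run(stored)
        else:
            self.executed.append(command)


d = Dictionary()
d.store_stream("STANDARD-REPORTS",
               ["REPORT STUDENT", "GLOSSARY", "PRINT STUDENT"])

d.run("MODIFY STUDENT")              # a maintenance update
d.run("PERFORM STANDARD-REPORTS")    # regenerate the documentation
print(d.executed)
```

A single PERFORM at the end of the session thus stands in for
the whole stored series of report commands.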


C. ADR DATADICTIONARY 


DATADICTIONARY is one of fourteen separate, but highly
integrated, software products produced by Applied Data
Research, Inc. (ADR). Initially introduced in 1978, the
integrated system, Relational Information Management
Environment (RIME), is considered to be one of the first
true examples of the fourth generation of systems software.


Three conditions are certain in the 1980s....First,
applications packages will not meet the need for most
applications that will be computerized. Second, systems
software products that improved productivity, reduced
application costs and increased [illegible] in the 1970s
will be even more valuable in the 1980s. And third,
existing applications will not be readily rewritten or
replaced and will have to be maintained for many
years....The success or failure of many organizations in
the 1980s will depend on how effectively they improve and
integrate data processing in their operations. This is
[illegible] critical for organizations that have been
traditional data processing users over the last 20 years
and have worked with second and third generation mainframe
hardware and software systems. [Ref. 23]


Prior to analyzing ADR's data dictionary, it is important to
review briefly the objectives of fourth generation software
and integrated systems and to provide an overview of RIME.

Each of the "generations" of system software can be
identified by one or more significant advancements. The
first generation provided primarily assembly language
programs. The second generation's gifts centered around
development of high-level languages and improved operating
systems. Numerous advances surfaced during the third gener-
ation, e.g., database management systems, data dictionaries,
structured programming techniques, early efforts at decision
support systems, and program generators. During the fourth
generation, it is anticipated that advances will occur in
three primary areas: very high-level languages, relational
database management systems, and the automated office or
integrated information center. In the latter, all automated
functions, including data processing, word processing, data-
base and file management, decision support, program develop-
ment and maintenance, and communications, will be combined
into one "total" system. This could, in theory, be accom-
plished by one giant program, or, in the case of ADR and
other vendors, as a series of smaller, integrated packages.

During 1982, the U. S. Army awarded a contract for the
largest, most complex information processing project ever
funded by the government. Named VIABLE (Vertical
Installation Automation Baseline), the project will provide
a nationwide automated network that will connect forty-seven
military bases to massive computer power at five regional
data processing centers. The network has been designed to
support the management of information in peacetime and in
times of war and other national emergencies. During the
planning period, interest centered on three principal func-
tional areas: communication, interactive program develop-
ment, and database management. The primary contractor,
Electronic Data Systems, selected 11 of ADR's products for
use as the base of the VIABLE system. A complete list of
ADR/RIME elements is included as Table 9.


TABLE 9
Components of ADR's DATACOM System

Component        Function
DATACOM/DB       Relational Database System
DATADICTIONARY   Resource Control System
[illegible]      On-line Query Language
[illegible]      Info. Retrieval/Reporting
[illegible]      On-line Data Entry System
[illegible]      Extended Language/Utilities
[illegible]      Program Management System
[illegible]      Program Maintenance System
[illegible]      Real-time Measurement System
[illegible]      [illegible] Pre-compiler
[illegible]      System Development Tool
[illegible]      Distributed Database Network
[illegible]      Electronic Mail System
[illegible]      Interactive Develop. System


Some of these elements can be considered to be high-
priced extras or application-specialized options. If an
organization were to utilize all components, users would
have access to a complete database system with data
dictionary, a relational query language, report and graph
generators, extended COBOL compiler, program development
support, distributed local data network, electronic mail
system, and more.

According to ADR literature, the heart of the integrated
system is DATADICTIONARY. The company's database system,
DATACOM/DB, a true relational database[3] system that
utilizes a patented flexible data structure, was designed
especially to interact with DATADICTIONARY. As an active
dictionary system, DATADICTIONARY is queried by all other
components of the system prior to access of system
information. This maximizes data integrity while minimizing
data redundancy. DATADICTIONARY offers a menu-driven user
interface. It provides security, supplies full
documentation/maintenance capabilities, and can be extended
to interact with future system products and to support
future user requirements. The documentation provided with
DATADICTIONARY and other ADR packages is almost overwhelming
in its completeness. The dictionary alone has fifteen
separate volumes. While an extremely capable system,
DATADICTIONARY is not one that will be easily or quickly
mastered.

[3] A relational database is one in which the relationships
between data are implied by the values of the data. For
example, two records are related if they have the same
attributes. STUDENT and PROFESSOR are related by the fact
that they are associated with a particular CLASS.

DATADICTIONARY provides 20 standard entity-types in its
system standard schema and supports user creation of addi-
tional, more application-specific schema descriptors. For
most applications, the standard types listed in Table 10
will prove to be sufficient.


TABLE 10
ADR DATADICTIONARY Standard Entity-types

DATABASE   KEY       SYSTEM     REPORT
AREA       ELEMENT   PROGRAM    JOB
FILE       LIBRARY   MODULE     STEP
RECORD     MEMBER    DATAVIEW   AUTHORIZATION
FIELD      PANEL     PERSON     NODE


Figure 5.3 A Logical Hierarchy of Entity-types


DATADICTIONARY maintains a logical hierarchy among the
principal standard entity-types, as indicated in Figure 5.3.
Many of the standard entity-types are provided with primary
relationships already defined with key subordinate
entity-types. For example, in our STUDENT example, we will
initially use the entity-type DATABASE to create our sample
database. When we define the database entity-occurrence,
the DATABASE-AREA relationship is automatically provided.
Similarly, when the area-occurrence is defined, the
AREA-FILE relationship is established by DATADICTIONARY.
In the case of RECORD, creation of an occurrence provides
three relationship-types: RECORD-FIELD, RECORD-KEY, and
RECORD-ELEMENT. These three relationships, at the lowest
level of the logical hierarchy, support actual entry of
attribute-values, or data. Whether system-defined or
user-created, all relationship-types in
DATADICTIONARY have four attributes: 
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1. Relationshlp-mnapping-= de SD eS the number of 
entity-occurrences which are the subjects and the 
objects of this relationship,  23.g. taz type of the 
relationship. DATADICTIONARY supports four types of 
relationship mappings, i.e.  one-to-one, one-to-many, 
many-to-one, and many-to-many. 

2. Reguiced-relationskip -- describes whether each 
entity-occurrence in the named object entity-type is 
to be related to at least ona entity-occurrence of 
the naned subject entity-type. 

3. Automatțic-relationship - describes  waether each 
entity-occurrence of the naaued object 2ntit;-tvve is 
to be automatically related to an entity-occurrence 
of the named subject entity-type when the ckject is 
added. 

4. Ordered-celationship - describes whether the order of 
relationships added in tais  relatioaship-type is 
significant. An ordered-relationship allows entity- 
occurrences to be retrieved ani displayed in a 
specific orier. 
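
The four attributes listed above can be pictured as a small
record attached to every relationship-type. The following
Python sketch is our own illustration and is not ADR code:
the class names, the unrelated_objects helper, the
AREA.FACULTY occurrence, and the attribute values chosen for
DATABASE-AREA are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Mapping(Enum):
    """The four relationship mappings named in the text."""
    ONE_TO_ONE = "1:1"
    ONE_TO_MANY = "1:M"
    MANY_TO_ONE = "M:1"
    MANY_TO_MANY = "M:M"


@dataclass
class RelationshipType:
    """Hypothetical record holding the four relationship attributes."""
    name: str
    subject_type: str
    object_type: str
    mapping: Mapping
    required: bool    # each object must be related to at least one subject
    automatic: bool   # relate the object automatically when it is added
    ordered: bool     # order in which relationships are added is significant


def unrelated_objects(rel, object_names, links):
    """Return object occurrences that violate a required relationship.

    links is a list of (subject_name, object_name) pairs.
    """
    if not rel.required:
        return []
    related = {obj for _subject, obj in links}
    return [name for name in object_names if name not in related]


# Illustrative values only; AREA.FACULTY is an invented occurrence.
database_area = RelationshipType(
    "DATABASE-AREA", "DATABASE", "AREA",
    Mapping.ONE_TO_MANY, required=True, automatic=False, ordered=False,
)
missing = unrelated_objects(
    database_area,
    ["AREA.STUDENT", "AREA.FACULTY"],
    [("DATABASE.STUDENT", "AREA.STUDENT")],
)
print(missing)  # ['AREA.FACULTY']
```

A dictionary that stores these flags with each relationship
can check a required relationship mechanically, as the helper
does here, rather than relying on the person entering the
data.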

If using the interactive version, DATADICTIONARY Online,
the user will be prompted by a series of panels, or menus.
The Master Menu is displayed in Figure 5.4. The master menu
supports creation, modification, and deletion of
entity-occurrences. Additionally, it provides access to all
other system menus through option (7). The following
procedures would be utilized to create the STUDENT example
within DATADICTIONARY. First, the Add Detail routine, option
(2), is selected. In answering the system prompts, the user
creates the new entity-occurrence, DATABASE.STUDENT, in the
following dialog:

    DDOL: SELECTION CRITERIA FOR DETAIL ADD
    LV TY ENTITY RECORD DD OCCURRENCE NAME VER STAT
    00 E  DATABASE         STUDENT         001
    CURRENT OCCURRENCE QUALIFIER:
    DATABASE STUDENT (001) TEST
    ************************************************
    DETAIL ADD
    ATTRIBUTE      VALUE
    DESCRIPTION    NPS STUDENT DATABASE
    CONTROLLER     DEPARTMENT OF REGISTRAR
    AUTHOR         REGISTRAR
    BASE-ID        001
    BASE-TYPE      ADR/DB
    DBMS-USED      RELATIONAL

                          MASTER MENU
    ENTER THE REQUESTED OPTION ==>       THERE ARE 08 OPTIONS
    1. DISPLAY MENU     MENU FOR DISPLAY FUNCTIONS
    2. ADD DETAIL       ADD DETAIL ENTITY-OCCURRENCE
    3. DELETE DETAIL    DELETE DETAIL ENTITY-OCCURRENCE
    4. UPDATE DETAIL    UPDATE DETAIL ENTITY-OCCURRENCE
    5. COPY             COPY/MODEL ENTITY-OCCURRENCE
    6. STATUS CHANGE    CHANGE ENTITY-OCCURRENCE STATUS
    7. SUPPORT MENU     ALIAS, DESCRIPTOR, RELATIONSHIP,
                        TEXT, AND QLIST
    8. SECURITY         OCCURRENCE SECURITY MAINTENANCE

Figure 5.4 ADR DATADICTIONARY Master Menu


Each of the 20 standard entity-types will contain predefined
key attributes. Values for these attribute-types are
entered during the Add Detail routine. In the case of the
DATABASE entity-type, and as was shown above, the key
attributes are DESCRIPTION, CONTROLLER, AUTHOR, BASE-ID,
BASE-TYPE, and DBMS-USED.

In similar fashion, the user must create the subordinate
logical structures, AREA.STUDENT, FILE.STUDENT, and
RECORD.STUDENT. As each occurrence is created, it must be
related to the next highest entity-occurrence in the logical
hierarchy, e.g., FILE.STUDENT must be related to
AREA.STUDENT. For this process, the user invokes the
Relationship Definition Panel to define the relationships.
DATADICTIONARY will respond with the Relationship Definition
Display, which presents the characteristics of each of the
relationships as it is enacted. Examples of these panels
are shown below:

    DDOL: RELATIONSHIP DEFINITION
    RELATIONSHIP NAME      $INTERNAL
    SUBJECT ENTITY TYPE    DATABASE.STUDENT
    OBJECT ENTITY TYPE     AREA.STUDENT

    RELATIONSHIP DEFINITION DISPLAY
    SELECTION:
    $INTERNAL  DATABASE.STUDENT  AREA.STUDENT
    NAME       SUBJ TYPE  OBJ TYPE  MAP  REQ  AUTO  ORDER
    $INTERNAL  DATABASE   AREA      1:M  Y    N     N


As a final step in installing the STUDENT database, QLIST
commands must be used to define specific fields, keys, and
elements within RECORD.STUDENT. This is the point where the
specific attributes of the STUDENT example, e.g., SSN, Name,
Service, and Rank, are entered into the database design.
The user defines attribute name, parent, class, type,
length, and number of repetitions. One example of this
process is as follows:


    DDOL: SELECTION CRITERIA FOR RECORD QLIST MAINT
    LV TY ENTITY RECORD DD OCCURRENCE NAME VER STAT
    00 E  RECORD           STUDENT             TEST
    CURRENT OCCURRENCE QUALIFIER:
    RECORD STUDENT (001) TEST
    ************************************************
    RECORD QLIST MAINTENANCE
    E FC FIELD NAME  PARENT NAME  INSERT AFT  C T LEN REP
      A  SERVICE     SSN          NAME        S C 004 001

Looking at the last line of the figure, the user has
indicated the following:

    FC (Function Code)         = A (add a field)
    Field Name                 = SERVICE
    Parent Name                = SSN (in this case, this is the key field)
    Insert After               = NAME (SERVICE's value will follow NAME's)
    C (Class)                  = S (simple, as opposed to a compound field)
    T (Type)                   = C (character, vice a numeric or binary field)
    LEN (Length of field)      = 004
    REP (Number of Repetitions) = 001 (vice a repeating field)

At this point, the schema of STUDENT has been entered into
DATADICTIONARY. The user may now use DATACOM/DB facilities
to enter attribute-values into the system. Upon completion,
the database administrator or authorized users can create as
many external views, or subschemas, as desired.

DATADICTIONARY receives high marks in the areas of data
integrity, security, and documentation/maintenance.
DATADICTIONARY's logical hierarchy of structures and
systematic installation procedures tend to enforce data
integrity. The dictionary's extension routines and view
generation processes have been written to ensure that data
integrity is maintained throughout expansion or
specialization of the database. To enforce security,
DATADICTIONARY provides multiple layers of protection. Two
separate and independent mechanisms are provided in all
implementations. These are (1) use of entity passwords, and
(2) inclusion of locks and override codes. If the
installation is the Online version, a third mechanism, user
validation, is available. As each entity is created, or at
any time afterwards, a four-digit password can be assigned
to that entity. Passwords can be either unique or assigned
to a series of related entities. Any user attempting to
modify or access a password-protected entity-occurrence will
be queried to provide the applicable password prior to
gaining access. The second layer of protection centers on
use of LOCK and OVERRIDE codes. Unlike passwords, which
either allow or prohibit access, codes can be utilized to
limit the degree of access granted. Three levels of
security are provided:

    LOCK0    No restrictions exist on an entity
             (default setting).

    LOCK1    The entity cannot be updated or deleted
             without an override code. The entity
             can be copied, displayed, or printed
             without restrictions.

    LOCK2    No action will be permitted unless the
             override code is given to the system.


The actual override codes will be used dictionary-wide, that
is, a single code will exist to satisfy LOCK1 conditions
while another code exists to access entities protected by
LOCK2. Finally, if using DATADICTIONARY Online, the highest
layer of security becomes user validation. The name of each
user of the system is defined as a PERSON entity-type. Each
entity-occurrence will include a unique password which must
be provided to enter the system through the online
interface. Four levels of authorization are supported by
DATADICTIONARY:

    _DIS    The user is allowed to display all data in
            the dictionary.
    _UPD    The user is allowed to update the dictionary.
    _COP    The user is allowed to copy an entity.
    _ADM    The user is allowed the use of all commands
            and is allowed to process all panels.

Authorization at one level will automatically provide all
lower authorizations.

ADR's multiple-layered approach to security provides a
system that is both highly flexible and very secure. The
database administrator will be able to provide whatever
degree of access is required by each individual user as
well as by each group of users within the system. If one
layer of security is broken, access will be prevented by the
other security mechanisms.
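
The way these layers combine can be sketched in a few lines
of Python. This is only our reading of the rules described
in this section, not ADR's implementation. The may_perform
function, the Entity class, the sample password, and the
override codes are invented for illustration, and we assume
that the four authorization levels rank in the order in
which they are listed (display lowest, administrator
highest).

```python
from dataclasses import dataclass
from typing import Optional

# Assumed ranking, lowest to highest, following the listing order in the text.
AUTH_LEVELS = ["DIS", "UPD", "COP", "ADM"]
# Minimum authorization assumed for each action in this sketch.
ACTION_LEVEL = {"display": "DIS", "update": "UPD", "copy": "COP"}


@dataclass
class Entity:
    name: str
    password: Optional[str] = None  # optional four-digit entity password
    lock: int = 0                   # 0, 1, or 2 (LOCK0, LOCK1, LOCK2)


def may_perform(user_level, action, entity, password=None,
                override=None, override_codes=None):
    """Return True only if every security layer permits the action."""
    override_codes = override_codes or {}
    # Layer 1: user validation and authorization level.
    if AUTH_LEVELS.index(user_level) < AUTH_LEVELS.index(ACTION_LEVEL[action]):
        return False
    # Layer 2: entity password, if one has been assigned.
    if entity.password is not None and password != entity.password:
        return False
    # Layer 3: LOCK1 guards updates; LOCK2 guards every action.
    needs_override = entity.lock == 2 or (entity.lock == 1 and action == "update")
    if needs_override:
        expected = override_codes.get(entity.lock)
        if expected is None or override != expected:
            return False
    return True


codes = {1: "A1", 2: "B2"}  # one dictionary-wide code per lock level (invented)
student = Entity("RECORD.STUDENT", password="4821", lock=1)

print(may_perform("UPD", "update", student, "4821", None, codes))   # False
print(may_perform("UPD", "update", student, "4821", "A1", codes))   # True
print(may_perform("UPD", "display", student, "4821", None, codes))  # True
print(may_perform("DIS", "update", student, "4821", "A1", codes))   # False
```

Each layer is tested independently, so a user who has
learned an entity password still cannot update a LOCK1
entity without the override code.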


    Invocation of any function thus authorized on any entity
    is still subject to the password and lock provision
    discussed earlier in this section. Thus, a user with
    BDD UPD authorization cannot modify an entity that is
    password protected unless the required password is
    supplied. [Ref. 24]

DATADICTIONARY provides extensive capabilities to support
maintenance and documentation of the data dictionary. It
can be maintained by using either the Online maintenance
facility or available batch commands. If using the online
facility, a series of screen panels will again guide the
user through the desired maintenance activity. This
facility will greatly ease individual changes; however,
major changes affecting numerous entities would be initiated
most easily through batch commands. In either case,
maintenance centers around four principal functions:

1. adding, copying, updating, or deleting system
   entities

2. search for, identification of, and creation of
   entity aliases

3. maintenance of descriptors and schema descriptors

4. maintenance of descriptive texts associated with
   system entities

Similarly, DATADICTIONARY provides numerous report
generation capabilities, most of which can be initiated
through either batch or Online Maintenance sessions.
Principal report types are shown in Table 11. Generated
reports will support both the initial generation of user
databases and subsequent maintenance of system data and the
structures utilized to display it.


TABLE 11
Principal Reports of DATADICTIONARY

[Table body illegible in the source; the surviving fragments
include the DETAIL and ALIAS report types.]



D. ORACLE 


ORACLE is a relational database management system developed
by Relational Software Incorporated of Menlo Park,
California. It was originally developed for use with
Digital Equipment Corporation PDP minicomputers and has been
converted to operate on IBM mainframes as well [Ref. 25].
Included in ORACLE is a dependent data dictionary that
performs a limited number of the functions discussed in
previous chapters.


Data is stored in ORACLE as relations, or two-dimensional
tables, which are organized into rows and columns. SQL
(Structured Query Language) is used for query, manipulation,
definition, and control of the ORACLE database. Information
about the contents of a table, its creator, authorized
users, calling programs, and associated views is kept in the
data dictionary and can be retrieved via SQL commands.
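
The idea that the dictionary is itself a set of tables that
can be queried with ordinary SQL is easy to demonstrate with
any relational system that keeps its schema in a catalog
table. The sketch below uses SQLite through Python's
standard sqlite3 module purely as a stand-in; it is not
ORACLE, and sqlite_master is SQLite's own catalog rather than
one of the ORACLE tables described in this section. The
table and view definitions are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE STUDENTS (NAME TEXT, SSN TEXT, RANK TEXT)")

# The catalog is read with an ordinary SELECT statement.
before = dict(conn.execute("SELECT name, type FROM sqlite_master"))
print(before)  # {'STUDENTS': 'table'}

# Defining a view updates the catalog with no separate dictionary step.
conn.execute("CREATE VIEW STUDENT_NAMES AS SELECT NAME, RANK FROM STUDENTS")
after = dict(conn.execute("SELECT name, type FROM sqlite_master"))
print(after)   # {'STUDENTS': 'table', 'STUDENT_NAMES': 'view'}
```

Because the catalog is maintained by the same statements
that change the database, it cannot fall out of step with
the schema, which is the property claimed for a dependent
dictionary of this kind.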

ORACLE's logical hierarchy of structures, as shown in
Figure 5.5, demonstrates the comparative simplicity of this
system. In this figure, a single arrowhead represents a
one-to-one relationship while the double arrowheads signify
one-to-many relationships. The database is divided into
logical partitions which can only be created or altered by
the database administrator. When users define tables, the
system allocates memory for one indexspace and one
dataspace. The indexspace is used by the database/dictionary
to store information about the table while the dataspace is
utilized for storing the actual information. As data is
entered into the database, the system automatically appends
extents (and pages) as necessary to support specific tables.

Figure 5.5 ORACLE's Logical Hierarchy

ORACLE's 18 data dictionary tables are described in
Figure 5.6. An example of one of the tables, CATALOG,
appears in Figure 5.7. Tables with the "SYS" prefix include
information on system data in addition to the user's data.
For example, a display of SYSCATALOG might appear as Figure
5.8. In this particular example, there are 20 entries, 18
of which are system tables or views.

ORACLE's data dictionary is automatically updated whenever
any additions or deletions are made to the database or
when views are defined or user privileges are changed, so it
always has a current description of the database. As an
example, assume a new view, NAVYVIEW, is created using the
following SQL command:

    UFI> CREATE VIEW NAVYVIEW AS
      2  SELECT NAME, SSN, RANK



    DTAB
      - Description of tables & views in Oracle Data
        Dictionary
    SYSCATALOG
      - Profile of tables & views accessible to user
    CATALOG
      - Profile of tables accessible to user, excluding
        data dictionary
    TAB
      - List of tables, views, clusters, and synonyms
        created by user
    SYSCOLUMNS
      - Specifications of columns in accessible tables
        and views
    COLUMNS
      - Specifications of columns in tables (excluding
        data dictionary)
    COL
      - Specifications of columns in tables created
        by the user
    SYSINDEXES
      - List of indexes, underlying columns, creator,
        and options
    INDEXES
      - Indexes created by user & indexes on tables
        created by user
    SPACES
      - Selection of space definitions for creating
        tables & clusters
    VIEWS
      - Quotations of the SQL statements upon which
        views are based
    SYSTABAUTH
      - Directory of access authorization granted by
        or to the user
    EXTENTS
      - Data structure of extents within tables
    STORAGE
      - Data and index storage allocations for user's
        own tables
    SYSSTORAGE
      - Summary of all database storage -- for DBA
        use only
    SYSUSERAUTH
      - Master list of Oracle users -- for DBA use only
    SYSEXTENTS
      - Data structure of tables throughout system
        -- for DBA use only
    PARTITIONS
      - File structure of files within partitions
        -- for DBA use only

Figure 5.6 Tables of the ORACLE Data Dictionary


Figure 5.8 ORACLE SYSCATALOG Listing

Upon completion of this dialog, the dictionary's files will
have been automatically updated to include the new view.
The CATALOG table would now appear as in Figure 5.9.

    NAME        CREATOR   TABTYPE   TABID
    STUDENTS    LANDIN    TABLE     228609
    ARMYVIEW    OWENS     VIEW      268800
    NAVYVIEW    LANDIN    VIEW      288240

Figure 5.9 ORACLE CATALOG Listing With New View

ORACLE provides security by using its data dictionary to
control access within the database. The database
administrator (DBA) provides the first level of access by
entering the user's name into the SYSUSERAUTH table.
Initial privileges, or subsequent changes to authorized
privileges, are issued using the GRANT or REVOKE commands.
ORACLE also supports multi-layered access: in addition to
privileges authorized by the DBA, a user can grant various
degrees of access privilege to others for tables or views
which he or she has created. A list of current
authorizations is maintained in the dictionary's SYSTABAUTH
view, as shown in Figure 5.10.
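
The bookkeeping behind such an authorization list can be
modeled very simply. The Python sketch below is a toy of our
own and does not reproduce ORACLE's GRANT and REVOKE logic:
the AuthCatalog class and its methods are invented, and we
assume for simplicity that only the creator of a table or
view may grant privileges on it. The user and table names
are taken from the SYSTABAUTH listing.

```python
class AuthCatalog:
    """Toy stand-in for a SYSTABAUTH-style authorization list."""

    def __init__(self):
        self.creators = {}   # table or view name -> creator
        self.grants = set()  # (grantor, grantee, name, privilege)

    def create(self, creator, name):
        self.creators[name] = creator

    def grant(self, grantor, grantee, name, privilege):
        # Simplifying assumption: only the creator may grant.
        if self.creators.get(name) != grantor:
            raise PermissionError(f"{grantor} did not create {name}")
        self.grants.add((grantor, grantee, name, privilege))

    def revoke(self, grantor, grantee, name, privilege):
        self.grants.discard((grantor, grantee, name, privilege))

    def allowed(self, user, name, privilege):
        # Creators hold every privilege on their own objects.
        if self.creators.get(name) == user:
            return True
        return any(
            grantee in (user, "PUBLIC") and table == name and priv == privilege
            for _grantor, grantee, table, priv in self.grants
        )


auth = AuthCatalog()
auth.create("LANDIN", "STUDENTS")
auth.grant("LANDIN", "OWENS", "STUDENTS", "SELECT")

can_read = auth.allowed("OWENS", "STUDENTS", "SELECT")        # True
can_delete = auth.allowed("OWENS", "STUDENTS", "DELETE")      # False
auth.revoke("LANDIN", "OWENS", "STUDENTS", "SELECT")
can_read_after = auth.allowed("OWENS", "STUDENTS", "SELECT")  # False
```

Every access decision is answered from the stored rows
alone, which is why keeping the authorization list inside
the dictionary lets the system enforce it on each request.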

ORACLE is a strong performer in the data integrity
category. Since the data dictionary is an integral part of
the database system, data is only maintained at one location
within the database. This prevents two users from acquiring
data from the database and getting different results. If
data were duplicated within the system, it would be possible
for one location to be updated while the other was not.
Figures 5.7 through 5.10 show that the ORACLE user will deal
mostly with subsets of the database, or subschemas.

    GRANTOR  GRANTEE  CREATOR  TNAME       TABTYPE  AUTHORITY
    SYSTEM   PUBLIC   SYSTEM   HELP        TABLE    SELECT
    SYSTEM   PUBLIC   SYSTEM   DUAL        TABLE    SELECT
    SYSTEM   PUBLIC   SYSTEM   SYSCOLUMNS  VIEW     SELECT
    SYSTEM   PUBLIC   SYSTEM   COLUMNS     VIEW     SELECT
    SYSTEM   PUBLIC   SYSTEM   SYSCATALOG  VIEW     SELECT
    SYSTEM   PUBLIC   SYSTEM   CATALOG     VIEW     SELECT
    SYSTEM   PUBLIC   SYSTEM   SYSINDEXES  VIEW     SELECT
    SYSTEM   PUBLIC   SYSTEM   SYSTABAUTH  VIEW     SELECT
    SYSTEM   PUBLIC   SYSTEM   TAB         VIEW     SELECT
    SYSTEM   PUBLIC   SYSTEM   EXPTAB      VIEW     SELECT
    SYSTEM   PUBLIC   SYSTEM   EXPVIEW     VIEW     SELECT
    SYSTEM   PUBLIC   SYSTEM   DTAB        VIEW     SELECT
    LANDIN   OWENS    LANDIN   STUDENTS    TABLE    SELECT
    LANDIN   OWENS    LANDIN   STUDENTS    TABLE    DELETE
    LANDIN   OWENS    LANDIN   STUDENTS    TABLE    UPDATE
    OWENS    OWENS    OWENS    ARMYVIEW    VIEW     DROP
    OWENS    OWENS    OWENS    ARMYVIEW    VIEW     SELECT
    LANDIN   OWENS    LANDIN   NAVYVIEW    VIEW     SELECT

Figure 5.10 ORACLE SYSTABAUTH Listing for User Owens

ORACLE's documentation is limited to the information
that can be found in the data dictionary tables. It does
not provide information about which users use which data,
how often data is used, or when it is used. ORACLE does
support maintainability through automatic update of its
tables and through the concept of data independence. This
concept implies a separation of data definitions from the
programs or queries that might access the data in the
database. This allows the structures or definitions of the
data constructs to be modified without necessitating changes
in the programs or queries that access the database. If a
table is extensively modified, a view can be created to
interface with current programs. ORACLE's data integrity
will maintain the currency of the view by automatically
updating the view whenever applicable portions of the
governing table are modified.
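
The use of a view to shield existing programs from a change
in table structure can be shown with a short experiment.
The sketch below again uses SQLite through Python's sqlite3
module as a convenient stand-in for a relational DBMS; it is
not ORACLE, and the ROSTER view, the PERSON and GRADE tables,
and the sample row are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE STUDENTS (NAME TEXT, SSN TEXT, RANK TEXT)")
conn.execute("INSERT INTO STUDENTS VALUES ('SMITH', '123456789', 'LT')")

# The "program" only ever issues this query against the view.
PROGRAM_QUERY = "SELECT NAME, RANK FROM ROSTER"
conn.execute("CREATE VIEW ROSTER AS SELECT NAME, RANK FROM STUDENTS")
before = conn.execute(PROGRAM_QUERY).fetchall()

# Restructure: split STUDENTS into two tables, then redefine the view.
conn.execute("DROP VIEW ROSTER")
conn.execute("CREATE TABLE PERSON (SSN TEXT PRIMARY KEY, NAME TEXT)")
conn.execute("CREATE TABLE GRADE (SSN TEXT, RANK TEXT)")
conn.execute("INSERT INTO PERSON SELECT SSN, NAME FROM STUDENTS")
conn.execute("INSERT INTO GRADE SELECT SSN, RANK FROM STUDENTS")
conn.execute("DROP TABLE STUDENTS")
conn.execute(
    "CREATE VIEW ROSTER AS "
    "SELECT p.NAME, g.RANK FROM PERSON p JOIN GRADE g ON g.SSN = p.SSN"
)
after = conn.execute(PROGRAM_QUERY).fetchall()

# A change to an underlying table shows through the view immediately.
conn.execute("UPDATE GRADE SET RANK = 'LCDR' WHERE SSN = '123456789'")
updated = conn.execute(PROGRAM_QUERY).fetchall()

print(before)   # [('SMITH', 'LT')]
print(after)    # [('SMITH', 'LT')]
print(updated)  # [('SMITH', 'LCDR')]
```

The query text never changes, yet it returns correct results
before and after the restructuring, which is the practical
meaning of data independence.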

ORACLE does provide the basic functions of definition,
update, retrieval, and software interface. However, like
other relational database management systems with dependent
data dictionaries, it does not offer the range of functions
of the other data dictionaries discussed in this chapter,
nor does it accomplish satisfactorily the three main
objectives of data management discussed in Chapter IV.
ORACLE's data dictionary

    provides little more than a method of defining the
    schema. The relational database management system
    'dictionary' arises because the system needs a way to
    store the schema and it does this through the use of the
    same structures as it uses for its data. [Ref.]

ORACLE could, however, serve as a good starting point for
further development.


    The modern relational DBMS does provide a very good
    basis for a good dictionary system. This is because the
    normal relational DBMS is equipped with two features
    that help in making the implementation easy:

    1. Many relational DBMS now have a "triggering" feature
       that causes a procedure to be invoked on some data
       condition or event. Such a feature is needed to tie a
       DBMS to a dictionary system.

    2. The availability of the schema tables substantially
       reduces the effort in implementing the dictionary
       system. [Ref. 27]

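
A small experiment suggests how a triggering feature could
tie a DBMS to a dictionary. In the sketch below, written for
SQLite through Python's sqlite3 module, a trigger records
every insertion into a table in a separate activity table.
This is an illustration of the idea only: the DD_ACTIVITY
table, the trigger name, and the sample rows are invented,
and nothing here describes a feature of ORACLE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENTS (NAME TEXT, SSN TEXT);

-- Hypothetical dictionary table recording activity against user tables.
CREATE TABLE DD_ACTIVITY (TABLE_NAME TEXT, EVENT TEXT, LOGGED_AT TEXT);

-- The trigger fires on a data event and writes a dictionary entry.
CREATE TRIGGER LOG_STUDENT_INSERT AFTER INSERT ON STUDENTS
BEGIN
    INSERT INTO DD_ACTIVITY VALUES ('STUDENTS', 'INSERT', datetime('now'));
END;
""")

conn.execute("INSERT INTO STUDENTS VALUES ('SMITH', '123456789')")
conn.execute("INSERT INTO STUDENTS VALUES ('JONES', '987654321')")

activity = conn.execute(
    "SELECT TABLE_NAME, EVENT, COUNT(*) FROM DD_ACTIVITY "
    "GROUP BY TABLE_NAME, EVENT"
).fetchall()
print(activity)  # [('STUDENTS', 'INSERT', 2)]
```

A record of this kind answers the questions of how often and
when data is used, which is the documentation that a purely
schema-based dictionary does not keep.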


The most important shortcoming of ORACLE's data dictionary
is its lack of documentation, without which it is difficult
to manage all aspects of an organization's data. If this
objective were incorporated into the system, ORACLE would be
a much more valuable tool.


E. COMPARISON OF DATA DESIGNER, DATAMANAGER,
DATADICTIONARY, AND ORACLE


Now that four representative samples of commercial data
dictionaries have been evaluated, we will compare the
primary features of each and identify which one(s) have come
closest to providing the features of our ideal system. For
ease of comparison, we have grouped all of the features,
functions, and guidelines that have been identified into the
six evaluation criteria categories: system standard schema
and extensibility, command and query languages, ease of use
(including menus), security, documentation and reports, and
application interfaces.

As the data dictionaries are evaluated in each of the
six categories, a brief chart will be used to compare each
dictionary against the FIPS standards. Each chart will
compare five data dictionaries:


FIPS = The ideal/FIPS data dictionary 
MSP = MSP DATAMANAGER 
ADR = ADR DATADICTIONARY 
DDE = DATA DESIGNER 
ORA = ORACLE DBMS/DD 


A very subjective scoring system will be used, with grades
ranging from three to zero. The ideal/FIPS standard will
automatically receive a grade of "3" in each area,
representing the ideal combination of features. The meaning
of each grade is as follows:

    "3" = Strong performance by DD; no criticism

    "2" = Good performance by DD; one or more significant
          shortcomings

    "1" = (1) DD supports functional area very poorly; or
          (2) DD does not support functional area, but
          another component of the system does

    "0" = DD (and remainder of system) fails to support
          functional area

First, the data dictionary should provide a system standard
schema and the capability to add new entities,
relationships, and attributes to it. As shown in Table 12,
while DATADICTIONARY and DATAMANAGER closely resemble the
ideal system proposed by the FIPS, DATA DESIGNER and, in
particular, ORACLE fail to provide these capabilities.
DATAMANAGER supports three "add-on" collections of schema
descriptors. When added to the standard schema, each will
increase DATAMANAGER's capabilities to support a specific
application, e.g., programming.


TABLE 12
Category One: Schemas and Extensibility

    Functional Category   | FIPS | MSP | ADR | DDE | ORA
    System Stand. Schema  |  3   |  3  |  3  |  1  |  0
    DA/User Extensible    |  3   |  3  |  3  |  0  |  0
    Category Subtotals    |  6   |  6  |  6  |  1  |  0

[The sub-rows under System Stand. Schema, giving counts of
entity-types and relationship-types for each system, are
illegible in the source.]



Second, the data dictionary should provide a command
language that will support queries from users while
reserving some capabilities solely for the use of the
dictionary administrator. This last ingredient supports
security and data integrity. Again, as seen in Table 13,
DATADICTIONARY and DATAMANAGER provide all capabilities of
the FIPS standard while the other two lag behind.

TABLE 13
Category Two: Command/Query Languages

    Functional Category   | FIPS | MSP | ADR | DDE | ORA
    CMD Interface Lang.   |  3   |  3  |  3  |  ?  |  ?
    Query Commands        |  3   |  3  |  3  |  1  |  ?
    DA-Only Commands      |  3   |  3  |  3  |  ?  |  ?
    Category Subtotals    |  9   |  9  |  9  |  4  |  5

    (? = value illegible in the source; subtotals are taken
    from Table 18)


Third, the ideal data dictionary must be relatively easy
to use, yet still powerful enough to support the experienced
user. One of the major ingredients of user-friendliness is
a menu-driven (or panel-driven) format. Good, easy-to-
understand examples are another important aid to the new
user. Table 14 reveals that, in our opinion, none of the
four systems can be considered easy to use. Looking at the
four as a group, two fail to use menus, one provides
examples which are complex and hard to understand, and the
fourth fails to provide either menus or good examples.

TABLE 14
Category Three: Relative Ease of Use

    Functional Category   | FIPS | MSP | ADR | DDE | ORA
    Menu-Driven           |  3   |  ?  |  ?  |  ?  |  ?
    New User Friendly     |  3   |  ?  |  ?  |  ?  |  ?
    Good Setup Example    |  3   |  ?  |  ?  |  ?  |  ?
    Category Subtotals    |  ?   |  ?  |  ?  |  ?  |  ?

    (? = value illegible in the source)

Fourth, security is one of the primary objectives of a
data dictionary. It should not only be able to control
general access to the system, but should also support the
capability to provide different levels of access to
different users. In Table 15, three of the four,
DATADICTIONARY, DATAMANAGER, and ORACLE, receive high marks
for providing both aspects of security. Security for
information contained within DATA DESIGNER must be provided
by the parent DBMS.

TABLE 15
Category Four: Security

    Functional Category        | FIPS | MSP | ADR | DDE | ORA
    Access Control (Password)  |  3   |  3  |  3  |  1  |  2
    Access Control (Levels)    |  3   |  3  |  3  |  1  |  3
    DA-Only Privileges         |  3   |  3  |  3  |  2  |  3
    Category Subtotals         |  9   |  9  |  9  |  4  |  8

Fifth, the clearness and logical layout of system
documentation should be considered. Additionally, the reports
and the documentation prepared by the data dictionary must
be evaluated for usability. As indicated in Table 16, each
of the four data dictionaries approaches that of our ideal
FIPS standard. It is interesting to note that the two
frontrunners, DATADICTIONARY and DATAMANAGER, have some
problems with documentation complexity.

TABLE 16
Category Five: Documentation and Reports

    Functional Category     | FIPS | MSP | ADR | DDE | ORA
    SYS Documentation
      clear/laid out well   |  3   |  2  |  2  |  3  |  3
    Good Examples of
      Report Types          |  3   |  ?  |  ?  |  ?  |  ?
    Reports Readable        |  3   |  ?  |  ?  |  ?  |  ?
    Category Subtotals      |  9   |  ?  |  ?  |  ?  |  ?

    (? = value illegible in the source)



Finally, the ideal data dictionary should support a
variety of applications, interfacing with both DBMS and
programming languages. DATADICTIONARY and DATAMANAGER both
provide interfaces to one or more DBMS and to two or more
programming languages. Table 17 pertains. While DATA
DESIGNER and ORACLE only interact with their system DBMS,
DATAMANAGER provides flexibility and versatility by
supporting several popular DBMS.

TABLE 17
Category Six: Application Interfaces

    Functional Category   | FIPS | MSP | ADR | DDE | ORA
    DBMS Interface(s)     |  3   |  3  |  ?  |  1  |  1
    Language Interfaces   |  3   |  3  |  ?  |  1  |  1
    Category Subtotals    |  6   |  6  |  5  |  2  |  2

    (? = value illegible in the source)

When total "scores" are calculated, the results are as
shown in Table 18. While none of the systems provides all
of the characteristics of the ideal/FIPS system, ADR
DATADICTIONARY and MSP DATAMANAGER come the closest.

TABLE 18
Data Dictionary Comparison Totals

    Functional Category   | FIPS | MSP | ADR | DDE | ORA
    Schemas/Extensible    |  6   |  6  |  6  |  1  |  0
    Command/Query Lang.   |  9   |  9  |  9  |  4  |  5
    Ease of Use           |  ?   |  ?  |  ?  |  ?  |  ?
    Security              |  9   |  9  |  9  |  4  |  8
    Documentation/Rpts    |  9   |  ?  |  ?  |  ?  |  ?
    Application Inter.    |  6   |  6  |  5  |  2  |  2
    Comparison Totals     |  ?   |  ?  |  ?  |  ?  |  ?

    (? = value illegible in the source)

If an organization were starting "fresh", with no previous
investment in software, the ADR family of products, RIME,
warrants serious consideration. If, on the other hand, the
organization already has one of the popular DBMS, and is
simply seeking to add a new, or better, data dictionary, the
free-standing DATAMANAGER might very well satisfy the need.
In each of these two excellent commercial packages, the
observed shortcomings lie in the areas of user friendliness
and clear examples for new users. Although these are
important requirements, the faults will be overcome as the
users gain experience.
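
The arithmetic behind the comparison totals is a simple
column sum of the category subtotals. The Python sketch
below adds up only the four categories whose subtotals can be
read for all five columns (schemas, command/query languages,
security, and application interfaces), so the figures it
produces are partial totals of our own and not the totals of
Table 18. The variable names are invented for the example.

```python
SYSTEMS = ["FIPS", "MSP", "ADR", "DDE", "ORA"]

# Category subtotals as they appear in the comparison tables.
legible_subtotals = {
    "Schemas/Extensibility": [6, 6, 6, 1, 0],
    "Command/Query Languages": [9, 9, 9, 4, 5],
    "Security": [9, 9, 9, 4, 8],
    "Application Interfaces": [6, 6, 5, 2, 2],
}

# Sum each system's column across the four categories.
partial_totals = {
    system: sum(row[i] for row in legible_subtotals.values())
    for i, system in enumerate(SYSTEMS)
}
print(partial_totals)
# {'FIPS': 30, 'MSP': 30, 'ADR': 29, 'DDE': 11, 'ORA': 15}
```

Even on this partial basis the ordering matches the
conclusion drawn in the text: DATAMANAGER and DATADICTIONARY
sit at or near the ideal, with DATA DESIGNER and ORACLE well
behind.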

In the case of the other two dictionaries, their
shortcomings would be far harder to forgive. Their problems
lie in areas of standard schemas, extensibility, security,
etc. Each seems more user-friendly, but, since they do less,
there are fewer procedures to be explained. DATA DESIGNER,
although an interesting package, simply does not provide
several of the primary characteristics that we expect to
find in an ideal data dictionary. ORACLE is certainly the
weakest of the four dictionaries we evaluated. As part of
the ORACLE DBMS, this system does provide some data
dictionary features. However, it is not the full-featured
data dictionary we would recommend.

VI. EXPANSIONS OF THE ROLE OF DATA DICTIONARIES

In this chapter we will suggest ways in which the role
of the data dictionary can be expanded beyond the basic uses
discussed in previous chapters. We will look first at how
the data dictionary can enforce standards in today's
increasingly common distributed data processing environment.
Then we will show how the process of decision making can be
supported through the use of a data dictionary. In
conclusion, we will attempt to foresee where data dictionary
technology will lead information resource management in the
years to come.


A. DISTRIBUTED DATA PROCESSING 


Our discussion of databases up to this point has
centered around the assumption that an organization has one
centralized database, with centralized database management
and control, that would be accessed by all users. However,
many organizations have decided to distribute computing
power to various departments and/or outlying sites,
depending on the organization's structure. In such a
situation, it is also likely that the organization's database
will have to be distributed. A distributed database is "a
consistent, logically interrelated collection of data stored
at dispersed locations" [Ref.]. These dispersed
locations, called nodes, are connected by means of a network
which allows the nodes to communicate.

Many factors have contributed to the increasing
popularity of distributed processing. Two of the most
important are the following: [Ref. 29]

1. Numerous advances in technology that have provided
   more powerful processing hardware at lower cost and
   improved communication and network capabilities.

2. The need for faster and easier access to time-critical
   information to assist in the decision making of
   organizations with geographically dispersed components
   requiring unified information sharing and processing.
   (This concept will be discussed in detail in the next
   section.)

For organizations that employ a centralized approach to
control widely-dispersed, autonomous divisions, an attempt
to adhere to the traditional concepts of centralized
information resources may be ineffective and self-defeating.
These organizations might be tempted to sacrifice the
ability to better satisfy user needs in order to preserve
control and traditional relationships. Fortunately,
managers are rapidly becoming aware of the many potential
advantages of distributing some, or all, of the
organization's data processing functions to the user level.
Technological advances continue to encourage these changes
because

    The availability of major computing resources in small,
    low-cost packages allows the dedication and distribution
    of needed capabilities, either standing alone or
    interconnected, when and where they are needed. Many of
    the complexities of centralized large-scale computing
    facilities are no longer necessary. [Ref. 30]

It is important to remember, however, that

    the complexities of integrated systems require digital
    data communications, appropriate software, and extensive
    planning and coordination. These complexities should
    not be underestimated. [Ref. 30]


One very successful corporation, Hewlett-Packard,
utilizes a combination of centralized, decentralized, and
distributed systems to support a variety of needs within the
organization. Corporate planning, employee benefits, and
establishment of standards are performed on mainframes
located at central management. Daily operations, data
processing, and employee pay and records have been
decentralized and are independently performed by each
division. Other functions, e.g., customer sales and support,
have been distributed to increase responsiveness and
timeliness.


Successful systems put the control of the data close to
the source of the information and the control of
processing close to the manager responsible for the
function being performed. In an organization like
Hewlett-Packard, this will frequently, but not always,
imply distributing the processing. Distributed
processing has made it possible for us to adapt to a
constantly expanding geographic operation, and a
constantly changing organizational structure, while
maintaining consistent administrative support.
[Ref.]


Another class of organization includes those that have
become so large and dispersed that they simply cannot be
supported effectively by totally centralized resources. The
armed services are prime examples of this type. For
example,


In an organization as large and decentralized as the
Navy, it would be impossible and inappropriate to impose
centralized control over the thousands of individual
small system applications that are clearly being put to
productive use. In fact, their main strength is their
ability to solve many of the information-handling
problems of users at the local level, without the need for
centralized software development and procurement delays.
[Ref.]


In the years ahead, a growing awareness of these conditions
will drive an ever-increasing number of military
organizations to distribute some portion of their
information resource needs.

Data dictionaries that are designed for operations
within distributed environments will require all of the
capabilities of those operating solely in a centralized
environment. However, a distributed data dictionary must
support three specialized functions in addition to basic
data dictionary functions:

1. the ability to locate data within the network

2. the coordination/management of distributed data

3. the ability to perform data transformation in
support of user applications

The first of these functions requires it to identify which
network node contains the specific information that is
needed. Whether the particular database is distributed by
replication or partitioning, the data dictionary must
provide information about its logical and physical
characteristics.
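
The location function can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the class `LocationDirectory`, its methods, and the node names are hypothetical and do not come from any product discussed in this thesis. The directory records which nodes hold each data item and reports whether an item is replicated.

```python
from dataclasses import dataclass, field


@dataclass
class LocationDirectory:
    """Hypothetical directory mapping each data item to the nodes that hold it."""
    holders: dict[str, set[str]] = field(default_factory=dict)

    def register(self, item: str, node: str) -> None:
        # Record that `node` stores a copy (or a partition) of `item`.
        self.holders.setdefault(item, set()).add(node)

    def locate(self, item: str) -> list[str]:
        # Return every node holding the item, sorted for stable output.
        return sorted(self.holders.get(item, set()))

    def is_replicated(self, item: str) -> bool:
        # More than one holder means updates must be synchronized.
        return len(self.holders.get(item, set())) > 1


# Illustrative node names; the file names are taken from the examples in the text.
directory = LocationDirectory()
directory.register("Personnel", "node_a")
directory.register("Personnel", "node_b")
directory.register("Inventory Item", "node_c")
```

A query such as `directory.locate("Personnel")` would return both holding nodes, which is the information the dictionary needs before it can coordinate updates to replicated data.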


In the case of replicated data, where functionally
identical copies of the data are stored at multiple nodes in
the network, the distributed DD/DS [data dictionary]
must have knowledge of the known redundancies throughout
the network. Synchronization of updates in this case is
critical. [Ref. 33]


In a partitioned database, where only certain portions of
the database are located at individual nodes, the data
dictionary's role becomes even more important because "it
must know the relationships among the pieces, and be able to
manage all the parts, such that this physical dispersion of
the data is transparent to the user" [Ref. 34]. Finally,
the distributed data dictionary may be required to perform
transformation of data to support various users. If serving
a heterogeneous network--one in which dissimilar types of
hardware and software coexist--the data dictionary will have
to translate between different data and storage structures.


[The distributed DD/DS (data dictionary)] can facilitate
these translation processes by mapping the metadata
descriptions of the sources to be transformed into the
target data. This is accomplished by storing in the
data dictionary the source and target metadata
descriptions to be used by the mapping process. [Ref. 35]
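
The mapping process described above can be sketched in Python. This is a hedged illustration, not the method of any cited system: the metadata layout, the field names `READINESS_STATUS` and `readiness`, and the function `transform` are all assumptions made for the example. The sketch stores source and target descriptions and uses them to translate one record.

```python
# Hypothetical source and target metadata descriptions held by the dictionary.
SOURCE_META = {"READINESS_STATUS": {"type": "str", "width": 2}}
TARGET_META = {"readiness": {"type": "int"}}

# Hypothetical field-level mapping from source names to target names.
FIELD_MAP = {"READINESS_STATUS": "readiness"}

# Converters selected by the target type named in the metadata.
CASTS = {"int": int, "str": str, "float": float}


def transform(record: dict) -> dict:
    """Translate a source record into the target structure using stored metadata."""
    out = {}
    for src_name, value in record.items():
        if src_name not in SOURCE_META:
            # The dictionary has no description for this field, so it cannot be mapped.
            raise KeyError(f"no source metadata for {src_name!r}")
        tgt_name = FIELD_MAP[src_name]
        tgt_type = TARGET_META[tgt_name]["type"]
        out[tgt_name] = CASTS[tgt_type](value)
    return out
```

Because the conversion is driven entirely by the stored descriptions, supporting a new target structure in this sketch means adding metadata rather than changing the mapping code.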
It is possible for the distribution of dictionary
capabilities to be accomplished by several alternative
configurations. One possible configuration, as mentioned
earlier, involves duplicating the data dictionary in its
entirety at each node of the network. An example of this is
shown as Figure 6.1. (Dashed lines indicate node-to-node
communications and dotted lines indicate
dictionary-to-dictionary communications.)

Figure 6.1 Duplicated Data Dictionaries

Each data dictionary will contain a complete copy of the
entire organization's metadata. While the nodes themselves
will interact frequently, the various copies of the
dictionary will not. However, when one copy of the
dictionary is updated, all other copies must be
automatically updated if data integrity is to be maintained.
This duplication of metadata will result in some degree of
additional overhead, but it will improve the responsiveness
of the system and minimize the necessity of inter-data
dictionary queries. In some implementations, communication
costs can be significantly reduced. This configuration will
be most desirable in cases in which the organization's
database(s) are also duplicated at each node or if nodes
would be likely to access each other's metadata often. A
stable organization with well-established data processing,
where metadata is not continuously being updated, would
benefit most from this configuration.
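
The trade-off in the duplicated configuration can be seen in a small Python sketch. It is a simplified model under stated assumptions: the class `DuplicatedDictionary` and the sample entry are hypothetical, and real update propagation would travel over the network rather than through a loop. Every update touches every copy, while every lookup is answered locally.

```python
class DuplicatedDictionary:
    """Hypothetical model: one full copy of the metadata per node."""

    def __init__(self, nodes):
        # Each node starts with its own, initially empty, complete copy.
        self.copies = {node: {} for node in nodes}

    def update(self, entry: str, definition: str) -> None:
        # Propagate the change to every copy so that all nodes stay consistent.
        for copy in self.copies.values():
            copy[entry] = definition

    def lookup(self, node: str, entry: str):
        # Any node answers from its own copy; no inter-dictionary query is needed.
        return self.copies[node].get(entry)


dup = DuplicatedDictionary(["node_a", "node_b", "node_c"])
dup.update("Readiness Status", "illustrative definition")
```

The update cost grows with the number of nodes, which is why the text recommends this arrangement for organizations whose metadata changes infrequently.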


In the second configuration, the data dictionary is
partitioned among the various network nodes. As shown in
Figure 6.2, each node contains only that portion of the
dictionary that contains the metadata it requires. No one
node or station within the system will have a complete data
dictionary.

Figure 6.2 Partitioned Data Dictionary (DD)

This configuration would be used when there is not much
need for the nodes of the network to access each other's
metadata and there is a relatively clear-cut differentiation
between the functions being carried on at each node, which
implies different metadata. Because redundancy is kept to an
absolute minimum, problems could arise if a node's data
dictionary partition were lost unless good backup procedures
were in effect. Since each node is only responsible for
maintaining its own portion of the whole, there is little
update overhead and thus little system delay as long as the
required metadata exists at that particular node.
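
A comparable Python sketch of the partitioned configuration follows. The class `PartitionedDictionary`, its counter of remote queries, and the sample entries are assumptions introduced for illustration only. Updates stay local, but a lookup for metadata held elsewhere must be forwarded to another node.

```python
class PartitionedDictionary:
    """Hypothetical model: each node holds only its own part of the metadata."""

    def __init__(self):
        self.partitions = {}     # node name -> {entry: definition}
        self.remote_queries = 0  # counts lookups that had to leave the local node

    def define(self, node: str, entry: str, definition: str) -> None:
        # Only the owning node's partition is touched, so update overhead is low.
        self.partitions.setdefault(node, {})[entry] = definition

    def lookup(self, node: str, entry: str):
        local = self.partitions.get(node, {})
        if entry in local:
            return local[entry]
        # The entry is not held locally, so the other partitions are searched.
        for other, part in self.partitions.items():
            if other != node and entry in part:
                self.remote_queries += 1
                return part[entry]
        return None


part = PartitionedDictionary()
part.define("node_a", "Personnel", "illustrative definition A")
part.define("node_b", "Reorder Point", "illustrative definition B")
```

In this model, losing the `node_b` partition would make "Reorder Point" unrecoverable, which mirrors the backup concern raised in the text.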
In the final configuration, the data dictionary is
distributed in a hierarchical structure. There will be one
"master" copy of the dictionary and one or more partial
copies throughout the network, as shown in Figure 6.3.

Figure 6.3 Hierarchy of Distributed Data Dictionaries

In this configuration, each node that contains a portion of
the data dictionary is responsible for updating the master
dictionary whenever its portion is modified. This structure
ensures data integrity and provides flexibility by allowing
varying amounts of metadata to be distributed. Another use
for this hierarchical structure might be to separate
functionality within a network, e.g., database, automated
office, and programming functions. Each of these functions
is able to maintain its portion of the dictionary locally
while one master copy is available to handle inter-partition
queries.
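
The hierarchical arrangement can be sketched the same way in Python. The class `HierarchicalDictionary`, the partition names, and the sample entries are hypothetical. Each local change is also reported to the master copy, and the master answers any query a partition cannot satisfy itself.

```python
class HierarchicalDictionary:
    """Hypothetical model: local partitions plus one master copy."""

    def __init__(self):
        self.master = {}      # complete copy of all metadata
        self.partitions = {}  # node name -> locally maintained portion

    def update(self, node: str, entry: str, definition: str) -> None:
        # The node changes its own portion and then updates the master.
        self.partitions.setdefault(node, {})[entry] = definition
        self.master[entry] = definition

    def lookup(self, node: str, entry: str):
        local = self.partitions.get(node, {})
        if entry in local:
            return local[entry]
        # Inter-partition queries are handled by the master copy.
        return self.master.get(entry)


hier = HierarchicalDictionary()
hier.update("database", "Personnel", "illustrative definition A")
hier.update("office", "Memo", "illustrative definition B")
```

Compared with the fully duplicated model, only two copies of each entry are written per update, at the price of depending on the master for cross-partition lookups.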

There are presently several commercial packages in the
development or testing stages that will be able to satisfy
the requirements of distributed processing. One system that
is already available and being used in numerous applications
is ADR's Relational Information Management Environment
(RIME) system. As discussed in Chapter V, this system
features fourteen separate components that can be integrated
into one "total" system. One component, D-NET, combines a
database, data dictionary, and communications interfaces to
support the special requirements of distributed processing.
D-NET is capable of supporting both homogeneous and
heterogeneous networks:


The flexibility provided by D-NET and the other software
components allows users to configure the distributed
system networks based on the needs of each node.
Various operating systems, computer types, and
cooperating software products can be used to create a
specific environment without affecting existing application
development and operations. [Ref.]


D-NET can implement the system's data dictionary,
DATADICTIONARY, as either one centralized dictionary or as
multiple copies stored at remote locations. Similarly,
RIME's database, DATACOM/DB, can be maintained either at one
centralized location or distributed to various nodes
throughout the network. D-NET serves as the basis of the
Army's Project VIABLE, providing numerous benefits that
include cost effectiveness, high expandability, increased
productivity, resource control and synchronization, and
independent operation at the local user's level.


B. DECISION-MAKING


In this section we will show how the data dictionary
provides managers with the efficiently recorded, accurate,
and timely information necessary to make decisions in
consonance with the goals of the organization, whether in a
centralized or distributed environment. According to the
report of the Committee on Review of Navy Long-Range ADP
Planning, "information technology", which includes data
dictionaries, is

critical to the Navy's ability to carry out its wartime
and peacetime roles in an optimum manner. The available
technologies would enable the Navy to approach these
missions with information and data that (1) have been
collected and recorded simply, (2) have improved
accuracy, (3) have been speedily reported, collated and
distributed, (4) lead to summaries that are timely and
to the point, as and when needed, and (5) have enabled
the resource commitments and costs to be reduced.
[Ref.]


1. The Decision-Making Process 


Herbert Simon's classic model of the decision-making
process, as cited by Sprague and Carlson [Ref. 38], consists
of three distinct steps: intelligence, design, and choice.
The use of a data dictionary supports the decision maker as
he takes each step.


a. Intelligence involves searching the environment
for conditions calling for decisions. Raw data must be
obtained, processed, and examined for clues that may
identify problems. However, so much data is available within
an organization that a seemingly infinite parade of
information can be produced--this situation is called
information overload. There must be some way of narrowing
down the amount of information that is presented to the
decision maker. A data dictionary used in conjunction with a
database can play an important role in this narrowing
process. As discussed earlier in the thesis, the dictionary
helps an organization identify and eliminate redundant data.
Its query language can be used to select information about a
particular entity and its report definition capability can
be used to generate aggregate, rather than detailed, data.
Relationships between entities are easily identified so that
managers' questions such as "What is the range of values for
'Readiness Status' data?" and "Which departments receive the
'Ammunition Transaction' report?" can be answered.
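
The two sample questions can be expressed as simple metadata queries. The Python sketch below is illustrative only: the element and report names come from the text, but the range values, the department names, and the functions `value_range` and `departments_receiving` are invented for the example.

```python
# Hypothetical metadata; names are from the text, values are invented.
ELEMENTS = {"Readiness Status": {"min": 1, "max": 4}}
REPORT_RECIPIENTS = {"Ammunition Transaction": ["Weapons", "Supply"]}


def value_range(element: str) -> tuple:
    """Answer: what is the range of values for this data element?"""
    meta = ELEMENTS[element]
    return (meta["min"], meta["max"])


def departments_receiving(report: str) -> list:
    """Answer: which departments receive this report?"""
    return sorted(REPORT_RECIPIENTS.get(report, []))
```

Both answers come from the dictionary's metadata alone, so the manager never has to scan the underlying database records.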




b. Design entails inventing, developing, and
analyzing possible courses of action. This involves
processes to understand the problem, to generate solutions,
and to test solutions for feasibility. The data dictionary
plays a key role in documenting the decision maker's
environment so that he or she will have a centralized source
of information from which to develop possible choices. The
dictionary can also be used to tailor information to meet
specific needs by defining user views of data and
restricting user access to certain data. In this way, users
can be presented only with the information they are supposed
to have and need to have, as determined by higher authority
in the organization, instead of having to deal with
non-essential information.
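
A user view of this kind amounts to a filter defined in the dictionary. The Python sketch below is a minimal illustration under stated assumptions: the record fields, the user roles, and the function `apply_view` are hypothetical and are not drawn from any product described in this thesis.

```python
# Hypothetical record and view definitions, invented for illustration.
RECORD = {"item": "valve", "on_hand": 12, "reorder_point": 5, "unit_cost": 40.0}

VIEWS = {
    "supply_clerk": {"item", "on_hand", "reorder_point"},
    "auditor": {"item", "unit_cost"},
}


def apply_view(user: str, record: dict) -> dict:
    """Return only the fields that the user's view allows."""
    allowed = VIEWS.get(user, set())  # users without a view see nothing
    return {name: value for name, value in record.items() if name in allowed}
```

Since the views are data rather than code, higher authority can change what a user sees by editing the dictionary entry alone.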

In addition to recording information about the
structure and functions of the organization, the
data dictionary can also be used to record information about
the decision makers themselves. In the case of the U.S.S.
Constellation, for example, information about the commanding
officer and the key elements of his environment can be
documented: which decisions he wishes to make and which ones
his subordinates will make, the mission assigned to the
carrier by the C.O.'s superiors, the relative priorities he
attaches to various subjects, his short term and long term
personal goals, previous decisions he has made, and so on.

c. Choice involves selecting a particular course of
action from those available and implementing that choice.
Of course, the ultimate decision will lie with the decision
maker, and not with the data dictionary. At best, the data
dictionary can present options to the decision maker and,
once the choice is made, can document the steps taken to
implement that choice.

2. Crisis Management


The accuracy and timeliness of information provided
to the decision maker becomes of critical importance when
the decision-making process occurs during a crisis
situation. In wartime, for example, there is usually a great
deal of risk associated with a decision: many decision
makers are involved, information must be consolidated from a
variety of sources and locations, little time is available
to make decisions, and, due to the uniqueness of events,
there is often no pre-defined structure for making the
decision. There are four ways that the data dictionary can
prove especially helpful in crisis decision-making.

a. The dictionary speeds up the information-gathering
process. As discussed earlier, user views and
accesses have been pre-defined and can be changed easily as
needed. Active data dictionaries provide for automatic
update of any changes that are made, so information is
always current.

b. The dictionary prioritizes information. The
priorities of the organization and the decision makers are
taken into account and can be updated as events occur. In
this way, the attention of decision makers is focused on
truly important information rather than dispersed over a
wide range of information.

c. The dictionary provides a common information
base. This is important when many decision makers at
different locations are involved. All participants have the
latest information and can also take advantage of the
"corporate memory" provided by the dictionary.

d. In short, the dictionary provides "intelligent"
information management. It reduces information overload,
tailors information to specific decision makers' needs, and
responds well to infrequent, ad hoc requests. It helps to
establish relationships between events as they occur.

A typical, or even "ideal", data dictionary will
not be able to fully support the decision-making process
without the help of additional sophisticated software to
take advantage of its capabilities. We believe that as the
acceptance and use of the data dictionary as a tool for
information resource management become widespread, the
demand for an expanded role for the dictionary will
increase. Organizations must become more accomplished in
the top-down planning process of the system development life
cycle in order to receive maximum benefits from data
dictionary technology.


C. CONCLUSIONS 


In this thesis, we have discussed the structure,
functions, and objectives of a data dictionary. We have
compared popular commercial products to an "ideal"
dictionary based on criteria we developed and on FIPS DDS
guidelines. We have analyzed the role of a data dictionary
in information resource management, including its support of
a distributed data processing environment and of the
decision-making process. It seems clear that as
organizations become cognizant of the need to manage their
information efficiently, the importance and necessity of
data dictionary implementation will continue to increase.

Designers of data dictionaries are aware of these trends
and are moving in the following directions:


First, toward what is known as an integrated data
dictionary; second, toward a free-standing dictionary
that serves as a driver of a distributed data processing
system made up of several types of computers, data base
management systems, file managers, and text editors.
[Ref.]


In reference to the first projection, several commercial
systems have been developed that feature integration of a
data dictionary with a database. One example of this is
ADR's RIME, which features integration of a database and a
data dictionary with numerous other components to form one
very capable and flexible system. Addressing the second
projection, Rallo [Ref. 40] foresees development of a "super
data dictionary" to support future integrated and
distributed systems:


In this environment, the data dictionary would act as a
driver of the system. The data dictionary/data
directory might also have some integrated facilities
permitting transfer of data among other system software
functions, including itself. There is a trend in this
direction, with other systems depending on the data
dictionary/data directory and that system itself
beginning to resemble a model of the enterprise.


We believe the future holds significant improvements and 
expansions of data dictionary technology. It is important 
that the development of standards for data dictionary 
compatibility continue along with the development of stan- 
dards that are currently being developed to support network 
communications. It is conceivable that these standards, if 
widely accepted, would allow any data dictionary to "talk" 
to another and to exchange information. The FIPS DDS
standards developed by the National Bureau of Standards will
most likely become the basis for data dictionaries procured
and used by the federal government.

We also foresee the use of fourth generation languages,
the extremely user-friendly, "close-to-natural-language"
languages that will facilitate user access to the
dictionary's metadata. These languages will replace the
formal command languages and awkward syntax described
earlier in the thesis. Another factor contributing to the
increased utility of data dictionaries will be the use of
sophisticated software and artificial intelligence
techniques in conjunction with the dictionary. As the
central source of data about an organization, the data
dictionary contains a broad base of information upon which
an artificial intelligence "expert" system can be built. For
example, it is possible that an expert system would be able
to verify and validate additions to the dictionary schema
based on pre-determined rules and information gained from
previous manipulations of the schema. It would also be able
to establish associations between the contents of the data
dictionary and flag them for the attention of the decision
maker. In addition, a "smart" data dictionary would be able
to "realize" that every time a user logs on to the system,
he asks for particular information, so that eventually, the
data dictionary will provide it for him automatically.

No matter what changes occur in data dictionary
technology, the data dictionary's role in the efficient
management of an organization's information resource will
continue to be an increasingly important one. The dictionary
will support the organization in its planning and analysis
of functions, its development of information systems, the
maintenance of those systems, and the intelligent use of
those systems. We believe that the military will soon
provide a vast market for data dictionary software and that
the demands of its users will drive data dictionary
technology even further.


BACKUS-NAUR FORM 


Backus-Naur form is a graphic notation for describing
the syntax of a language. It is used by the Federal
Information Processing Standard for Data Dictionary Systems
(FIPS DDS) to show the format of the commands used to
manipulate the dictionary. The following are common
Backus-Naur symbols used by the FIPS DDS:

< >   denotes a word or phrase

|     indicates a choice between two or more
      alternatives, "or"

[ ]   represents an option that the user may or may not
      include

{ }   is used to set off choices separated by "|" and
      to enclose the format of the command

The syntax for the ADD-ENTITY command appears as
follows:

ADD-ENTITY
    {[OF] {ENTITY-TYPE | E-T} <entity-type-name>
    WHERE NAME [IS] <name-clause>
    [WHERE {ATTRIBUTE | A} <attribute-clause-1>
        [, ... <attribute-clause-n>]]
    [WITH SECURITY <security-clause>]}


It indicates that there are several different ways of adding
an entity to the dictionary. At a minimum, the command must
include ENTITY-TYPE or E-T, an entity-type name, WHERE NAME,
and a name clause. The words OF and IS are optional, as are
the last two phrases set off by brackets. If the phrases
are used, the same rules hold for choosing elements within
them.
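
The minimum form of the command can be checked mechanically. The Python sketch below is a hedged illustration, not a FIPS DDS implementation: it recognizes only the required part of ADD-ENTITY as described above, assumes that names are single words without spaces, and uses a hypothetical function name, `parse_add_entity`. The optional attribute and security phrases are deliberately left out.

```python
import re

# Required part only: [OF] and [IS] are optional, ENTITY-TYPE may be written E-T.
# Assumption: entity-type names and name clauses are single tokens.
ADD_ENTITY = re.compile(
    r"^ADD-ENTITY\s+(?:OF\s+)?(?:ENTITY-TYPE|E-T)\s+(?P<etype>\S+)"
    r"\s+WHERE\s+NAME\s+(?:IS\s+)?(?P<name>\S+)$"
)


def parse_add_entity(command: str):
    """Return (entity_type_name, name) for a valid minimal command, else None."""
    match = ADD_ENTITY.match(command.strip())
    if match is None:
        return None
    return (match.group("etype"), match.group("name"))
```

Both the long form, `ADD-ENTITY OF ENTITY-TYPE FILE WHERE NAME IS Personnel`, and the short form, `ADD-ENTITY E-T FILE WHERE NAME Personnel`, are accepted by this sketch and yield the same pair of names.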